Abstract

We derive achievable error exponents for the relay channel using the method of types. In particular, two block-Markov coding schemes are analyzed: partial decode-forward and compress-forward. The derivations require combinations of the techniques involved in the proofs of Csiszár-Körner-Marton's packing lemma for the error exponent of channel coding and Marton's covering lemma for the error exponent of source coding with a fidelity criterion.



Index Terms

Relay channel, Error exponent, Reliability function, Partial decode-forward, Compress-forward, Method of types

I. Introduction

We derive achievable error exponents for the discrete memoryless relay channel. This channel, introduced by van der Meulen in [1], is a point-to-point communication system that consists of a sender $X_1$, a receiver $Y_3$ and a relay with input $Y_2$ and output $X_2$. The capacity is not known in general but there exist several coding schemes that are optimal for certain classes of relay channels, e.g., degraded. These coding schemes, introduced in the seminal work by Cover and El Gamal [2], include decode-forward (DF), partial decode-forward (PDF) and compress-forward (CF). Using PDF, the capacity of the relay channel $C$ is lower bounded as

$$ C \ge \max \min\{ I(X_1 X_2; Y_3),\ I(U; Y_2 | X_2) + I(X_1; Y_3 | X_2 U) \} \tag{1} $$

where the maximization is over all $P_{U X_1 X_2}$. DF is a special case of PDF in which $U = X_1$ and instead of decoding part of the message as in PDF, the relay decodes the entire message. In CF, a more complicated coding scheme, the relay sends a description of $Y_2$ to the receiver. The receiver uses $Y_3$ as side information à la Wyner-Ziv [3, Ch. 11] to reduce the rate of the description. One form of the CF lower bound reads [4]

$$ C \ge \max \min\{ I(X_1; \hat{Y}_2 Y_3 | X_2),\ I(X_1 X_2; Y_3) - I(Y_2; \hat{Y}_2 | X_1 X_2 Y_3) \} \tag{2} $$

where the maximization is over $P_{X_1}$, $P_{X_2}$ and $P_{\hat{Y}_2 | X_2 Y_2}$. Both PDF and CF involve block-Markov coding [2] in which the channel is used $N = nb$ times over $b$ blocks, each involving an independent message to be sent, and the relay codeword in block $j$ depends statistically on the message from block $j - 1$.

In addition to capacities, in information theory, error exponents are also of tremendous interest. They quantify the exponential rate of decay of the error probability when the rate of the code is below capacity or the set of rates is strictly within the capacity region. Such results allow us to provide approximate bounds on the blocklength needed to achieve a certain rate (or set of rates) and so provide a means to understand the tradeoff between rate(s) and error probability. In this paper, we derive achievable error exponents for various well-known coding schemes for the relay channel including PDF and CF.

A. Main Contributions 

Our two main contributions here are the derivations of error exponents for PDF and CF. For PDF, which contains 
DF as a special case, by using maximum mutual information (MMI) decoding [5], [6], we show that the analogue 
of the random coding error exponent (i.e., an error exponent that is similar in style to the one presented in [6, 
Thm. 10.2]) is universally attainable. That is, the decoder does not need to know the channel statistics. This is in 
contrast to the recent work by Bradford-Laneman [7] in which the authors employed maximum-likelihood decoding
with the sliding window decoding technique of Kramer-Gastpar-Gupta [8]. In [7], the channel needs to be known
at the decoder. To prove this result, we generalize the techniques used to prove the packing lemmas in [6], [9], 
[10] so that they are applicable to the relay channel. 

For CF, we draw inspiration from [11] which derives achievable error exponents for Wyner-Ahlswede-Körner
coding and Wyner-Ziv coding. We handle the combination of covering and packing in a similar way as [11] in order
to derive an achievable error exponent for CF. In addition, a key technical contribution is to take into account
the conditional correlation between $\hat{Y}_2$ and $X_1$ (given $X_2$) using a technique introduced by Scarlett-Guillén i
Fàbregas [12].

B. Related Work 

The work that is most closely related to this paper is that by Bradford-Laneman [7] who derived the random 
coding error exponent for DF based on Gallager's Chernoff-bounding techniques [13]. We generalize their result 
to PDF and we use MMI decoding. For PDF, our work leverages techniques used to prove various forms of the
packing lemmas for multiuser channels, for example, in the monograph by Haroutunian et al. [10]. It also uses a
change-of-measure technique introduced by Hayashi [14] for proving second-order coding rates in channel coding. 

For CF, since it is related to Wyner-Ziv, we leverage the work of Kelly-Wagner [11] who derived an achievable
exponent for Wyner-Ziv and Wyner-Ahlswede-Körner coding. In a similar vein, Moulin-Wang [15] and Dasarathy-Draper [16]
derived lower bounds for the error exponents of Gel'fand-Pinsker coding and content identification
respectively. We also note that Ngo et al. [17] presented an achievable error exponent for amplify-forward for the
AWGN relay channel but did not take into account block-Markov coding [2].

C. Structure of Paper

This paper is structured as follows. In Section II, we state our notation, some standard results from the method 
of types [6], [18] and the definitions of the discrete memoryless relay channel, block-Markov coding and error
exponents. In Section III, we state and prove error exponent theorems for PDF. In Section IV, we state and prove 
an error exponent theorem for CF. In these two sections, the proofs of the theorems are provided in the final 
subsections (Subsections III-C and IV-D) and can be omitted at a first reading. We conclude our discussion in 
Section V where we also state further avenues of research. 

II. Preliminaries 
A. Notation and The Method of Types 

We generally adopt the notation from Csiszár and Körner [6] with a few modifications. Random variables (e.g., $X$) and their realizations (e.g., $x$) are in capital and small letters respectively. All random variables take values on finite sets, denoted in calligraphic font (e.g., $\mathcal{X}$). For a sequence $x^n = (x_1, \ldots, x_n) \in \mathcal{X}^n$, its type is the distribution $P(x) = \frac{1}{n} \sum_{i=1}^n \mathbf{1}\{x = x_i\}$ where $\mathbf{1}\{\mathrm{clause}\}$ is 1 if the clause is true and 0 otherwise. All logs are with respect to base 2 and we use the notation $\exp(t)$ to mean $2^t$. Finally, $|a|^+ := \max\{a, 0\}$ and $[a] := \{1, \ldots, \lceil a \rceil\}$ for any $a \in \mathbb{R}$.

The set of distributions supported on $\mathcal{X}$ is denoted as $\mathscr{P}(\mathcal{X})$. The set of types in $\mathscr{P}(\mathcal{X})$ with denominator $n$ is denoted as $\mathscr{P}_n(\mathcal{X})$. The set of all sequences $x^n$ of type $P$ is the type class of $P$ and is denoted as $\mathcal{T}_P := \{x^n \in \mathcal{X}^n : x^n \text{ has type } P\}$. For a distribution $P \in \mathscr{P}(\mathcal{X})$ and a stochastic matrix $V : \mathcal{X} \to \mathcal{Y}$, we denote the joint distribution interchangeably as $P \times V$ or $PV$. This should be clear from the context. For $x^n \in \mathcal{T}_P$, the set of sequences $y^n \in \mathcal{Y}^n$ such that $(x^n, y^n)$ has joint type $P \times V$ is the $V$-shell $\mathcal{T}_V(x^n)$. Let $\mathscr{V}_n(\mathcal{Y}; P)$ be the family of stochastic matrices $V : \mathcal{X} \to \mathcal{Y}$ for which the $V$-shell of a sequence of type $P \in \mathscr{P}_n(\mathcal{X})$ is not empty. Information-theoretic quantities are denoted in the usual way. For example, $I(X; Y)$ and $I(P, V)$ denote the mutual information where the latter expression makes clear that the joint distribution of $(X, Y)$ is $P \times V$. In addition, $\hat{I}(x^n \wedge y^n)$ is the empirical mutual information of $(x^n, y^n)$, i.e., if $x^n \in \mathcal{T}_P$ and $y^n \in \mathcal{T}_V(x^n)$, then $\hat{I}(x^n \wedge y^n) = I(P, V)$. For a distribution $P \in \mathscr{P}(\mathcal{X})$ and two stochastic matrices $V : \mathcal{X} \to \mathcal{Y}$, $W : \mathcal{X} \times \mathcal{Y} \to \mathcal{Z}$, $I(V, W | P)$ is the conditional mutual information $I(Y; Z | X)$ where $(X, Y, Z)$ is distributed as $P \times V \times W$.
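As a concrete illustration of these definitions, the following Python sketch computes the type of a sequence and the empirical mutual information of a pair of sequences. The function names `type_of` and `empirical_mutual_information` are ours and purely illustrative; logarithms are to base 2 as in the text.

```python
from collections import Counter
from math import log2


def type_of(seq):
    """Type (empirical distribution) of a sequence, as a dict symbol -> frequency."""
    n = len(seq)
    return {a: c / n for a, c in Counter(seq).items()}


def empirical_mutual_information(xs, ys):
    """Empirical mutual information of (x^n, y^n) in bits, computed from the joint type."""
    assert len(xs) == len(ys)
    p_xy = type_of(list(zip(xs, ys)))  # joint type
    p_x = type_of(xs)                  # marginal type of x^n
    p_y = type_of(ys)                  # marginal type of y^n
    return sum(p * log2(p / (p_x[a] * p_y[b])) for (a, b), p in p_xy.items())


x = [0, 0, 1, 1]
print(type_of(x))
print(empirical_mutual_information(x, [0, 0, 1, 1]))  # y^n determines x^n: 1 bit
print(empirical_mutual_information(x, [0, 1, 0, 1]))  # product joint type: 0 bits
```

Note that the empirical mutual information depends on the pair of sequences only through their joint type, which is the property used repeatedly in the proofs.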

We will also often use the asymptotic notation $\doteq$ to denote equality to first order in the exponent. That is, for two positive sequences $\{a_n, b_n\}_{n=1}^\infty$, we say that $a_n \doteq b_n$ if and only if $\lim_{n\to\infty} n^{-1} \log \frac{a_n}{b_n} = 0$. Also, we will use $\stackrel{.}{\le}$ to denote inequality to first order in the exponent. That is, $a_n \stackrel{.}{\le} b_n$ if and only if $\limsup_{n\to\infty} n^{-1} \log \frac{a_n}{b_n} \le 0$.

Fig. 1. The relay channel with the notation we use in this paper



We also summarize some known facts about types that we use in the sequel. Fix a type $P \in \mathscr{P}_n(\mathcal{X})$ and a
sequence $x^n \in \mathcal{T}_P$. Also fix a conditional type $V \in \mathscr{V}_n(\mathcal{Y}; P)$ and a sequence $y^n \in \mathcal{T}_V(x^n)$.

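Lemma 1 (Basic Properties of Types). For any stochastic matrix $W : \mathcal{X} \to \mathcal{Y}$, we have

1) $|\mathscr{V}_n(\mathcal{Y}; P)| \le (n+1)^{|\mathcal{X}||\mathcal{Y}|}$

2) $(n+1)^{-|\mathcal{X}||\mathcal{Y}|} \exp(n H(V|P)) \le |\mathcal{T}_V(x^n)| \le \exp(n H(V|P))$

3) $W^n(y^n | x^n) = \exp[-n(D(V \| W | P) + H(V|P))]$

4) $(n+1)^{-|\mathcal{X}||\mathcal{Y}|} \exp[-n D(V \| W | P)] \le W^n(\mathcal{T}_V(x^n) | x^n) \le \exp[-n D(V \| W | P)]$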
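Item 2 of Lemma 1 can be checked numerically on a small example. The sketch below is only an illustration under our own conventions: a joint type is represented by a hypothetical table `counts[a][b]` giving the number of positions $i$ with $(x_i, y_i) = (a, b)$, from which the exact size of the $V$-shell (a product of multinomial coefficients, one per input symbol) and the conditional entropy $H(V|P)$ are computed.

```python
from math import comb, log2


def shell_size(counts):
    """Exact |T_V(x^n)|: for each input symbol a, a multinomial coefficient over its row."""
    size = 1
    for row in counts:
        remaining = sum(row)
        for c in row:
            size *= comb(remaining, c)
            remaining -= c
    return size


def conditional_entropy(counts):
    """H(V|P) in bits for the joint type described by the count table."""
    n = sum(sum(row) for row in counts)
    h = 0.0
    for row in counts:
        n_a = sum(row)
        for c in row:
            if c > 0:
                h -= (c / n) * log2(c / n_a)
    return h


counts = [[6, 2], [1, 3]]  # binary x and y, n = 12
n = sum(map(sum, counts))
size = shell_size(counts)
upper = 2 ** (n * conditional_entropy(counts))
lower = upper / (n + 1) ** (2 * 2)  # |X| = |Y| = 2
print(size, lower <= size <= upper)
```

The polynomial prefactor $(n+1)^{-|\mathcal{X}||\mathcal{Y}|}$ is loose for small $n$, but it does not affect statements made to first order in the exponent.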

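Lemma 2 (Joint Typicality for Types I). Let $P \in \mathscr{P}_n(\mathcal{X}_1)$, $V \in \mathscr{V}_n(\mathcal{X}_2; P)$ and $V' \in \mathscr{V}_n(\mathcal{Y}; P \times V)$. Define $W(y|x_1) := \sum_{x_2} V(x_2|x_1) V'(y|x_1, x_2)$. Then for any $x_1^n \in \mathcal{T}_P$, if $X_2^n$ is uniformly drawn from the shell $\mathcal{T}_V(x_1^n)$ and $y^n$ is any element of $\mathcal{T}_W(x_1^n)$,

$$ \mathbb{P}[y^n \in \mathcal{T}_{V'}(x_1^n, X_2^n)] \doteq \exp[-n I(V, V' | P)]. \tag{3} $$

Lemma 3 (Joint Typicality for Types II). Let $P \in \mathscr{P}_n(\mathcal{X}_1)$, $V \in \mathscr{V}_n(\mathcal{X}_2; P)$ and $V' \in \mathscr{V}_n(\mathcal{Y}; P)$. Let $W : \mathcal{Y} \times \mathcal{X}_1 \to \mathcal{X}_2$ be any (consistent) channel satisfying $\sum_y W(x_2|y, x_1) V'(y|x_1) = V(x_2|x_1)$. Fix $x_1^n \in \mathcal{T}_P$. Let $X_2^n$ be uniformly distributed in $\mathcal{T}_V(x_1^n)$. For any $y^n \in \mathcal{T}_{V'}(x_1^n)$, we have

$$ \mathbb{P}[X_2^n \in \mathcal{T}_W(y^n, x_1^n)] \stackrel{.}{\le} \exp[-n I(V', W | P)]. \tag{4} $$

These two lemmas are proven in Appendix A and B respectively.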



B. The Relay Channel and Error Exponents 

In this section, we recall the definition of the relay channel and the notion of error exponents. 

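Definition 1. A 3-node discrete memoryless relay channel (DM-RC) is a tuple $(\mathcal{X}_1 \times \mathcal{X}_2, W, \mathcal{Y}_2 \times \mathcal{Y}_3)$ where
$\mathcal{X}_1$, $\mathcal{X}_2$, $\mathcal{Y}_2$ and $\mathcal{Y}_3$ are finite sets and $W : \mathcal{X}_1 \times \mathcal{X}_2 \to \mathcal{Y}_2 \times \mathcal{Y}_3$ is a stochastic matrix. The sender (node 1) wishes
to communicate a message $M$ to the receiver (node 3) with the help of the relay node (node 2). See Fig. 1.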

Definition 2. A $(2^{nR}, n)$ code for the DM-RC consists of a message set $\mathcal{M} = [2^{nR}]$, an encoder $f : \mathcal{M} \to \mathcal{X}_1^n$
that assigns a codeword to each message, a sequence of relay encoders $g_i : \mathcal{Y}_2^{i-1} \to \mathcal{X}_2$, $i \in [n]$, each assigning
a symbol to each past received sequence and a decoder $\varphi : \mathcal{Y}_3^n \to \mathcal{M}$ that assigns an estimate of the message to
each channel output. The rate of this code is $R$.

We assume that $M$ is uniformly distributed on $\mathcal{M}$ and the channel is memoryless. More precisely, this means
that the current received symbols $(Y_{2i}, Y_{3i})$ are conditionally independent of the message and the past symbols
$(M, X_1^{i-1}, X_2^{i-1}, Y_2^{i-1}, Y_3^{i-1})$ given the current transmitted symbols $(X_{1i}, X_{2i})$. The average error probability is
$\mathbb{P}(\hat{M} \ne M)$ where $\hat{M} = \varphi(Y_3^n)$ is the estimated message.

As in [7], for both PDF and CF, we will use block-Markov coding to send a message $M$ representing $N R_{\mathrm{eff}} = nb R_{\mathrm{eff}}$ bits of information. We use the channel $N$ times and this total blocklength is partitioned into $b$ correlated
blocks each of blocklength $n$. The number of blocks $b$ is fixed and regarded as a constant (it does not grow with $n$).
The message is split into $b - 1$ sub-messages $M_j$, $j \in [b-1]$, each representing $nR$ bits of information. Thus, the
effective rate is

$$ R_{\mathrm{eff}} := \frac{b-1}{b} R. \tag{5} $$



Under this coding setup, we wish to provide bounds for the reliability function using both PDF and CF.

Definition 3. The reliability function [6] for the DM-RC is defined as

$$ E(R) := \sup \Big\{ \liminf_{n\to\infty} -\frac{1}{n} \log \mathbb{P}(\hat{M} \ne M) \Big\} \tag{6} $$

where $M \in \mathcal{M} := [2^{nR}]$ and the supremum is over all sequences of $(2^{nR}, n)$ codes for the DM-RC (Definition 2).
The block-Markov error exponent computed with respect to $b$ blocks (as above) is defined as

$$ E_b(R_{\mathrm{eff}}) := \sup \Big\{ \liminf_{n\to\infty} -\frac{1}{N} \log \mathbb{P}(\hat{M} \ne M) \Big\} \tag{7} $$

where the total blocklength is $N = nb$, the message is $M \in \mathcal{M} := [2^{N R_{\mathrm{eff}}}]$ and the supremum is over all
block-Markov coding schemes involving $b$ blocks.

This definition of $E_b(R_{\mathrm{eff}})$ differs from the conventional reliability functions in the literature (i.e., that in (6)) in
that we are attaching a class of coding schemes to it, namely, block-Markov coding [2]. However, we find it more
convenient, for the purposes of this paper, to define the error exponent as such as we will always be employing
variations of block-Markov coding (such as PDF and CF) over a fixed number of blocks $b$. Of course, one is free
to choose the exact block-Markov coding scheme for the maximization in (7). Clearly, the usual reliability function
$E(R)$ is always lower bounded by $E_b(R_{\mathrm{eff}}) = E_b(\frac{b-1}{b} R)$ for any $b \in \mathbb{N}$.

In the next two sections, we will study two schemes that provide lower bounds to $E_b(R_{\mathrm{eff}})$.

III. Partial Decode-Forward (PDF) 

We warm up by deriving two achievable error exponents using PDF. In PDF, the relay decodes part of the
message in each block. For block $j \in [b]$, the part of the message that is decoded by the relay is indicated as $M_j'$
and the remainder of the message is $M_j''$. Thus, $M_j = (M_j', M_j'')$. We will state the main theorem in Section III-A,
provide some remarks in Section III-B and prove it in Section III-C.

A. Analogue of the Random Coding Error Exponent 

The analogue of the random coding exponent is presented as follows: 

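Theorem 4 (Random Coding Error Exponent for Relaying). Fix $b \in \mathbb{N}$, auxiliary alphabet $\mathcal{U}$ and a joint distribution $Q_{X_2} \times Q_{U|X_2} \times Q_{X_1|UX_2} \in \mathscr{P}(\mathcal{X}_2 \times \mathcal{U} \times \mathcal{X}_1)$. These distributions induce the following virtual channels:

$$ W_{Y_2|UX_2}(y_2|u, x_2) := \sum_{x_1, y_3} W(y_2, y_3|x_1, x_2) Q_{X_1|UX_2}(x_1|u, x_2), \tag{8} $$

$$ W_{Y_3|UX_2}(y_3|u, x_2) := \sum_{x_1, y_2} W(y_2, y_3|x_1, x_2) Q_{X_1|UX_2}(x_1|u, x_2), \tag{9} $$

$$ W_{Y_3|UX_1X_2}(y_3|u, x_1, x_2) := \sum_{y_2} W(y_2, y_3|x_1, x_2), \quad \forall\, u \in \mathcal{U}. \tag{10} $$

We have the following lower bound on $E_b(R_{\mathrm{eff}})$:

$$ E_b(R_{\mathrm{eff}}) \ge \frac{1}{b} \max_{R' + R'' = R} \min\{F(R'), G(R'), \tilde{G}(R'')\} \tag{11} $$

where $F(R')$, $G(R')$ and $\tilde{G}(R'')$ are constituent error exponents defined as

$$ F(R') := \min_{V : \mathcal{U} \times \mathcal{X}_2 \to \mathcal{Y}_2} D(V \| W_{Y_2|UX_2} | Q_{UX_2}) + |I(Q_{U|X_2}, V | Q_{X_2}) - R'|^+ \tag{12} $$

$$ G(R') := \min_{V : \mathcal{U} \times \mathcal{X}_2 \to \mathcal{Y}_3} D(V \| W_{Y_3|UX_2} | Q_{UX_2}) + |I(Q_{UX_2}, V) - R'|^+ \tag{13} $$

$$ \tilde{G}(R'') := \min_{V : \mathcal{U} \times \mathcal{X}_1 \times \mathcal{X}_2 \to \mathcal{Y}_3} D(V \| W_{Y_3|UX_1X_2} | Q_{UX_1X_2}) + |I(Q_{X_1|UX_2}, V | Q_{UX_2}) - R''|^+ \tag{14} $$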

The proof of this result is based on a modification of the techniques used in the packing lemma [6], [9], [10] 
and is provided in Section III-C. 
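To get a feel for exponents of the form in (12), the following Python sketch evaluates $\min_V D(V \| W | Q) + |I(Q, V) - R|^+$ by brute-force grid search in a deliberately degenerate special case: binary input and output and a trivial conditioning alphabet ($|\mathcal{X}_2| = 1$). The binary symmetric channel, the grid resolution and all function names are our assumptions for illustration; this is not a procedure taken from the paper, and a grid search only approximates the minimum.

```python
from itertools import product
from math import log2


def kl_cond(V, W, Q):
    """Conditional divergence D(V || W | Q) in bits."""
    d = 0.0
    for a, q in enumerate(Q):
        for b in range(len(V[a])):
            if V[a][b] > 0:
                if W[a][b] == 0:
                    return float("inf")
                d += q * V[a][b] * log2(V[a][b] / W[a][b])
    return d


def mutual_info(Q, V):
    """Mutual information I(Q, V) in bits."""
    out = [sum(Q[a] * V[a][b] for a in range(len(Q))) for b in range(len(V[0]))]
    total = 0.0
    for a, q in enumerate(Q):
        for b in range(len(out)):
            p = q * V[a][b]
            if p > 0:
                total += p * log2(V[a][b] / out[b])
    return total


def random_coding_exponent(Q, W, rate, steps=50):
    """Grid search over binary channels V of D(V||W|Q) + |I(Q,V) - rate|^+."""
    grid = [k / steps for k in range(steps + 1)]
    best = float("inf")
    for v0, v1 in product(grid, repeat=2):
        V = [[1 - v0, v0], [1 - v1, v1]]
        best = min(best, kl_cond(V, W, Q) + max(mutual_info(Q, V) - rate, 0.0))
    return best


W = [[0.9, 0.1], [0.1, 0.9]]  # hypothetical binary symmetric channel
Q = [0.5, 0.5]
for rate in (0.2, 0.4, 0.6):
    print(rate, random_coding_exponent(Q, W, rate))
```

In this toy case the exponent is positive for rates below $I(Q, W)$ and vanishes above it, since choosing $V = W$ makes both terms zero.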



B. Remarks on the Error Exponent for Partial Decode-Forward 
A few comments are in order with regard to Theorem 4. 

1) Firstly, since $Q_{X_2}$, $Q_{U|X_2}$ and $Q_{X_1|UX_2}$ as well as the splitting of $R$ into $R'$ and $R''$ are arbitrary, we can
maximize the lower bounds in (11) over these free parameters. Secondly, for a fixed split $R' + R'' = R$, if

$$ R' < I(U; Y_2 | X_2) \tag{15} $$

$$ R' < I(U X_2; Y_3) \tag{16} $$

$$ R'' < I(X_1; Y_3 | U X_2), \tag{17} $$

for some $Q_{X_2}$, $Q_{U|X_2}$ and $Q_{X_1|UX_2}$, then $F(R')$, $G(R')$ and $\tilde{G}(R'')$ are positive. Hence, the error probability
decays exponentially fast if $R$ satisfies the PDF lower bound in (1).

2) Secondly, note that $F(R')$ is the error exponent for decoding a part of the message at the relay and $G(R')$
and $\tilde{G}(R'')$ are the error exponents for decoding the rest of the message at the decoder. Setting $U = X_1$
recovers DF for which the error exponent (without the sliding-window modification) is provided in [7].

3) The exponent in (11) demonstrates a tradeoff between the effective rate and error probability: for a fixed $R$,
as the number of blocks $b$ increases, $R_{\mathrm{eff}}$ increases but the lower bound on the error exponent decreases.
Varying $R$ alone, of course, also allows us to observe this tradeoff.

4) In general, the sliding window technique [8] yields potentially better exponents. We do not explore this
extension here but note that the improvements can be obtained by appealing to the techniques in [7,
Props. 1 and 2].

5) In the proof, we use a random coding argument and show that averaged over the random code, the probability 
of error is desirably small. We do not assert the existence of a good codebook via the classical packing 
lemma [6, Lem. 10.1] nor via its variants [10]. Instead, we use a change-of-measure technique that was also 
used by Hayashi [14] for proving second-order coding rates in channel coding. 
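The tradeoff described in the third remark can be made concrete with a few lines of Python. The numbers below (a per-block rate of 0.4 and a per-block exponent of 0.05) are hypothetical placeholders, not values derived from any particular channel; the sketch only tabulates the effective rate in (5) and the $1/b$ scaling of the bound in (11).

```python
def effective_rate(rate, b):
    """R_eff = (b - 1) / b * R, as in (5)."""
    return (b - 1) / b * rate


def block_markov_bound(per_block_exponent, b):
    """(1 / b) times the per-block exponent, i.e. the form of the bound in (11)."""
    return per_block_exponent / b


rate, exponent = 0.4, 0.05  # hypothetical per-block rate and per-block exponent
for b in (2, 5, 10, 50):
    print(b, round(effective_rate(rate, b), 4), round(block_markov_bound(exponent, b), 5))
```

As $b$ grows, the effective rate approaches $R$ from below while the bound on the exponent shrinks like $1/b$.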

C. Proof of Theorem 4 

Proof: We code over $b$ blocks each of length $n$ (block-Markov coding [2]). Fix rates $R'$ and $R''$ satisfying
$R' + R'' = R$. We fix a joint type $Q_{X_2} \times Q_{U|X_2} \times Q_{X_1|UX_2} \in \mathscr{P}_n(\mathcal{X}_2 \times \mathcal{U} \times \mathcal{X}_1)$ satisfying $H(Q_{X_2}) < R'$,
$H(Q_{U|X_2} | Q_{X_2}) < R'$ and $H(Q_{X_1|UX_2} | Q_{U|X_2} Q_{X_2}) < R''$. Split each message $M_j$, $j \in [b-1]$, of rate $R$ into two
independent parts $M_j'$ and $M_j''$ of rates $R'$ and $R''$ respectively.

Generate $k' = \exp(nR')$ sequences $x_2^n(m_{j-1}')$, $m_{j-1}' \in [k']$, uniformly at random from the type class $\mathcal{T}_{Q_{X_2}}$. For
each $m_{j-1}' \in [k']$ (with $m_0' = 1$), generate $k'$ sequences $u^n(m_j' | m_{j-1}')$ uniformly at random from the $Q_{U|X_2}$-shell
$\mathcal{T}_{Q_{U|X_2}}(x_2^n(m_{j-1}'))$. Now for every $(m_{j-1}', m_j') \in [k']^2$, generate $k'' = \exp(nR'')$ sequences $x_1^n(m_j', m_j'' | m_{j-1}')$
uniformly at random from the $Q_{X_1|UX_2}$-shell $\mathcal{T}_{Q_{X_1|UX_2}}(u^n(m_j' | m_{j-1}'), x_2^n(m_{j-1}'))$. This gives a random codebook.
We bound the error probability averaged over realizations of this random codebook.

The sender and relay cooperate to send $m_j'$ to the receiver. In block $j$, the encoder transmits $x_1^n(m_j', m_j'' | m_{j-1}')$ (with $m_0' = m_b' = 1$ by convention).

At the $j$-th step, the relay does MMI decoding [5] for $m_j'$ given $\check{m}_{j-1}'$ and $y_2^n(j)$. More precisely, it declares
that $\check{m}_j'$ is sent if

$$ \check{m}_j' = \arg\max_{m_j' \in [\exp(nR')]} \hat{I}(u^n(m_j' | \check{m}_{j-1}') \wedge y_2^n(j) | x_2^n(\check{m}_{j-1}')). \tag{18} $$

By convention, set $\check{m}_0' = 1$. Recall that $\hat{I}(u^n(m_j' | m_{j-1}') \wedge y_2^n(j) | x_2^n(m_{j-1}'))$ is the conditional mutual information
$I(U; Y_2 | X_2)$ where the dummy random variables $(X_2, U, Y_2)$ have joint type given by the sequences
$(u^n(m_j' | m_{j-1}'), x_2^n(m_{j-1}'), y_2^n(j))$. After all blocks are received, the decoder performs backward decoding [19],
[20] by using the MMI decoder [5]. In particular, it declares that $\hat{m}_j'$ is sent if

$$ \hat{m}_j' = \arg\max_{m_j' \in [\exp(nR')]} \hat{I}(u^n(\hat{m}_{j+1}' | m_j'), x_2^n(m_j') \wedge y_3^n(j+1)). \tag{19} $$

After all $\hat{m}_j'$, $j \in [b-1]$, have been decoded in step (19), the decoder then decodes $m_j''$ using MMI [5] as follows:

$$ \hat{m}_j'' = \arg\max_{m_j'' \in [\exp(nR'')]} \hat{I}(x_1^n(\hat{m}_j', m_j'' | \hat{m}_{j-1}') \wedge y_3^n(j) | u^n(\hat{m}_j' | \hat{m}_{j-1}'), x_2^n(\hat{m}_{j-1}')). \tag{20} $$


In steps (18), (19) and (20), if more than one message attains the argmax, then pick any one uniformly
at random. Note that since the code is of constant composition, MMI decoding is the same as minimum conditional
entropy decoding [21]. We assume, by symmetry, that $M_j = (M_j', M_j'') = (1, 1)$ is sent for all $j \in [b-1]$. The line
of analysis in [7, Sec. III] yields

$$ \mathbb{P}(\hat{M} \ne M) \le (b-1)(\epsilon_R + \epsilon_{D,1} + \epsilon_{D,2}) \tag{21} $$

where for any $j \in [b-1]$,

$$ \epsilon_R := \mathbb{P}(\check{M}_j' \ne 1 \mid \check{M}_{j-1}' = 1) \tag{22} $$

is the error probability in relay decoding and for any $j \in [b-1]$,

$$ \epsilon_{D,1} := \mathbb{P}(\hat{M}_j' \ne 1 \mid \hat{M}_{j+1}' = 1, \check{M}_j' = 1) \quad \text{and} \tag{23} $$

$$ \epsilon_{D,2} := \mathbb{P}(\hat{M}_j'' \ne 1 \mid \hat{M}_j' = 1, \hat{M}_{j-1}' = 1) \tag{24} $$

are the error probabilities at the decoder. Since $b$ is assumed to be constant, it does not affect the exponential
dependence of the error probability in (21). So we just bound $\epsilon_R$, $\epsilon_{D,1}$ and $\epsilon_{D,2}$. Since all the calculations are
similar, we focus on $\epsilon_R$, leading to the error exponent $F(R')$ in (12).
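The MMI rule used at the relay in (18) can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions: the codebook is a plain list of candidate $u$-sequences sharing the same conditional type given $x_2^n$, the function names are hypothetical, and ties are broken by taking the first maximizer rather than uniformly at random as in the proof.

```python
from collections import Counter
from math import log2


def cond_empirical_mi(u, y, x2):
    """Empirical conditional mutual information of u^n and y^n given x2^n, in bits."""
    n = len(u)
    c_uyx = Counter(zip(u, y, x2))
    c_ux = Counter(zip(u, x2))
    c_yx = Counter(zip(y, x2))
    c_x = Counter(x2)
    return sum(
        (c / n) * log2(c * c_x[x] / (c_ux[(a, x)] * c_yx[(b, x)]))
        for (a, b, x), c in c_uyx.items()
    )


def mmi_decode(codebook, y, x2):
    """Index of the codeword with the largest empirical conditional mutual information."""
    scores = [cond_empirical_mi(u, y, x2) for u in codebook]
    return max(range(len(codebook)), key=scores.__getitem__)


x2 = [0, 0, 0, 0, 1, 1, 1, 1]
codebook = [
    [0, 0, 1, 1, 0, 0, 1, 1],
    [0, 1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 1, 0, 0, 1],
]
y_clean = list(codebook[1])         # noiseless observation of codeword 1
y_noisy = [1, 1, 0, 1, 0, 1, 0, 1]  # codeword 1 with its first symbol flipped
print(mmi_decode(codebook, y_clean, x2), mmi_decode(codebook, y_noisy, x2))
```

The decoder never evaluates the channel law, which is why the resulting exponent is universally attainable.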

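An error occurs in step (18) (conditioned on neighboring blocks being decoded correctly so $\check{m}_{j-1}' = m_{j-1}' = 1$)
if and only if there exists some index $m_j' \ne 1$ such that the empirical conditional information computed with respect
to $u^n(m_j' | 1)$ is higher than that of $u^n(1 | 1)$, i.e.,

$$ \epsilon_R = \mathbb{P}\big(\exists\, m_j' \ne 1 : \hat{I}(U^n(m_j'|1) \wedge Y_2^n(j) | X_2^n(1)) \ge \hat{I}(U^n(1|1) \wedge Y_2^n(j) | X_2^n(1))\big). \tag{25} $$

We can condition this on various values of $(u^n, x_2^n) \in \mathcal{T}_{Q_{UX_2}}$ as follows,

$$ \epsilon_R = \sum_{(u^n, x_2^n) \in \mathcal{T}_{Q_{UX_2}}} \frac{1}{|\mathcal{T}_{Q_{UX_2}}|}\, \beta_n(u^n, x_2^n), \tag{26} $$

where

$$ \beta_n(u^n, x_2^n) := \mathbb{P}\big(\exists\, m_j' \ne 1 : \hat{I}(U^n(m_j'|1) \wedge Y_2^n(j) | X_2^n(1)) \ge \hat{I}(U^n(1|1) \wedge Y_2^n(j) | X_2^n(1)) \,\big|\, (U^n(1|1), X_2^n(1)) = (u^n, x_2^n)\big). \tag{27} $$

It can be seen that $\beta_n(u^n, x_2^n)$ does not depend on $(u^n, x_2^n) \in \mathcal{T}_{Q_{UX_2}}$ so we simply write $\beta_n = \beta_n(u^n, x_2^n)$ and we
only have to upper bound $\beta_n$. Because $x_1^n(1, 1 | 1)$ is drawn uniformly at random from $\mathcal{T}_{Q_{X_1|UX_2}}(u^n, x_2^n)$,

$$ \beta_n = \sum_{x_1^n \in \mathcal{T}_{Q_{X_1|UX_2}}(u^n, x_2^n)} \frac{1}{|\mathcal{T}_{Q_{X_1|UX_2}}(u^n, x_2^n)|} \sum_{y_2^n} W^n(y_2^n | x_1^n, x_2^n)\, \mu_n(y_2^n), \tag{28} $$

where

$$ \mu_n(y_2^n) := \mathbb{P}\big(\exists\, m_j' \ne 1 : \hat{I}(U^n(m_j'|1) \wedge Y_2^n(j) | X_2^n(1)) \ge \hat{I}(U^n(1|1) \wedge Y_2^n(j) | X_2^n(1)) \,\big|\, (U^n(1|1), X_2^n(1)) = (u^n, x_2^n),\, Y_2^n(j) = y_2^n\big). \tag{29} $$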
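We continue to bound $\beta_n$ as follows:

$$ \beta_n \le \sum_{x_1^n \in \mathcal{T}_{Q_{X_1|UX_2}}(u^n, x_2^n)} (n+1)^{|\mathcal{U}||\mathcal{X}_1||\mathcal{X}_2|} \exp(-n H(Q_{X_1|UX_2} | Q_{UX_2})) \sum_{y_2^n} W^n(y_2^n | x_1^n, x_2^n)\, \mu_n(y_2^n) \tag{30} $$

$$ = \sum_{x_1^n \in \mathcal{T}_{Q_{X_1|UX_2}}(u^n, x_2^n)} (n+1)^{|\mathcal{U}||\mathcal{X}_1||\mathcal{X}_2|}\, Q_{X_1|UX_2}^n(x_1^n | u^n, x_2^n) \sum_{y_2^n} W^n(y_2^n | x_1^n, x_2^n)\, \mu_n(y_2^n) \tag{31} $$

$$ \le (n+1)^{|\mathcal{U}||\mathcal{X}_1||\mathcal{X}_2|} \sum_{y_2^n} \mu_n(y_2^n) \sum_{x_1^n} Q_{X_1|UX_2}^n(x_1^n | u^n, x_2^n)\, W^n(y_2^n | x_1^n, x_2^n) \tag{32} $$

$$ \le (n+1)^{|\mathcal{U}||\mathcal{X}_1||\mathcal{X}_2|} \sum_{y_2^n} W_{Y_2|UX_2}^n(y_2^n | u^n, x_2^n)\, \mu_n(y_2^n), \tag{33} $$

where (30) follows by lower bounding the size of $\mathcal{T}_{Q_{X_1|UX_2}}(u^n, x_2^n)$ (Lemma 1), (31) follows by the fact that
the $Q_{X_1|UX_2}^n(\cdot | u^n, x_2^n)$-probability of any sequence in $\mathcal{T}_{Q_{X_1|UX_2}}(u^n, x_2^n)$ is exactly $\exp(-n H(Q_{X_1|UX_2} | Q_{UX_2}))$
(Lemma 1), (32) follows by dropping the constraint $x_1^n \in \mathcal{T}_{Q_{X_1|UX_2}}(u^n, x_2^n)$, and (33) follows from the definition of $W_{Y_2|UX_2}$
in (8). We have to perform the calculation leading to (33) because each $X_1^n$ is drawn uniformly at random from
$\mathcal{T}_{Q_{X_1|UX_2}}(u^n, x_2^n)$ and not from the product measure $Q_{X_1|UX_2}^n(\cdot | u^n, x_2^n)$, which would simplify the calculation and
the introduction of the product channel $W_{Y_2|UX_2}^n$. This change-of-measure technique (from constant-composition to
product) was also used by Hayashi [14, Eqn. (76)] in his work on second-order coding rates for channel coding.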
Hence it remains (essentially) to bound $\mu_n(y_2^n)$. By applying the union bound and exploiting symmetry we have

$$ \mu_n(y_2^n) \le \min\{1, \exp(nR')\, \tau_n(y_2^n)\} \tag{34} $$

where

$$ \tau_n(y_2^n) := \mathbb{P}\big(\hat{I}(U^n(2|1) \wedge Y_2^n(j) | X_2^n(1)) \ge \hat{I}(U^n(1|1) \wedge Y_2^n(j) | X_2^n(1)) \,\big|\, (U^n(1|1), X_2^n(1)) = (u^n, x_2^n),\, Y_2^n(j) = y_2^n\big). \tag{35} $$

The only randomness now is in $U^n(2|1)$. Denote the conditional type of $y_2^n$ given $x_2^n$ as $P_{y_2^n | x_2^n}$. Now consider
reverse channels $\tilde{V} : \mathcal{Y}_2 \times \mathcal{X}_2 \to \mathcal{U}$ such that $\sum_{y_2} \tilde{V}(u | y_2, x_2) P_{y_2^n | x_2^n}(y_2 | x_2) = Q_{U|X_2}(u | x_2)$, i.e., they are marginally
consistent with $Q_{U|X_2}$. Denote this class of reverse channels as $\mathscr{R}(Q_{U|X_2})$. We have,

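$$ \tau_n(y_2^n) = \sum_{\substack{\tilde{V} \in \mathscr{V}_n(\mathcal{U}; P_{y_2^n|x_2^n} Q_{X_2}) \cap \mathscr{R}(Q_{U|X_2}) : \\ I(P_{y_2^n|x_2^n}, \tilde{V} | Q_{X_2}) \ge \hat{I}(u^n \wedge y_2^n | x_2^n)}} \mathbb{P}\big(U^n(2|1) \in \mathcal{T}_{\tilde{V}}(y_2^n, x_2^n)\big). \tag{36} $$

By Lemma 3 (with the identifications $P \leftarrow Q_{X_2}$, $V \leftarrow Q_{U|X_2}$, $V' \leftarrow P_{y_2^n|x_2^n}$ and $W \leftarrow \tilde{V}$), for any $\tilde{V} \in \mathscr{R}(Q_{U|X_2})$,

$$ \mathbb{P}\big(U^n(2|1) \in \mathcal{T}_{\tilde{V}}(y_2^n, x_2^n)\big) \stackrel{.}{\le} \exp(-n I(P_{y_2^n|x_2^n}, \tilde{V} | Q_{X_2})). \tag{37} $$

As a result, with all sums taken over the same set of $\tilde{V}$ as in (36),

$$ \tau_n(y_2^n) \stackrel{.}{\le} \sum_{\tilde{V}} \exp(-n I(P_{y_2^n|x_2^n}, \tilde{V} | Q_{X_2})) \tag{38} $$

$$ \le \sum_{\tilde{V}} \exp(-n \hat{I}(u^n \wedge y_2^n | x_2^n)) \tag{39} $$

$$ \doteq \exp(-n \hat{I}(u^n \wedge y_2^n | x_2^n)). \tag{40} $$

Plugging this back into (34) yields,

$$ \mu_n(y_2^n) \stackrel{.}{\le} \min\{1, \exp(-n(\hat{I}(u^n \wedge y_2^n | x_2^n) - R'))\} \tag{41} $$

$$ = \exp\big[-n\, |\hat{I}(u^n \wedge y_2^n | x_2^n) - R'|^+\big]. \tag{42} $$

Plugging this back into (33) yields,

$$ \beta_n \stackrel{.}{\le} \sum_{y_2^n} W_{Y_2|UX_2}^n(y_2^n | u^n, x_2^n) \exp\big[-n\, |\hat{I}(u^n \wedge y_2^n | x_2^n) - R'|^+\big] \tag{43} $$

$$ \le \sum_{V \in \mathscr{V}_n(\mathcal{Y}_2; Q_{UX_2})} W_{Y_2|UX_2}^n(\mathcal{T}_V(u^n, x_2^n) | u^n, x_2^n) \exp\big[-n\, |I(Q_{U|X_2}, V | Q_{X_2}) - R'|^+\big] \tag{44} $$

$$ \le \sum_{V \in \mathscr{V}_n(\mathcal{Y}_2; Q_{UX_2})} \exp(-n D(V \| W_{Y_2|UX_2} | Q_{UX_2})) \exp\big[-n\, |I(Q_{U|X_2}, V | Q_{X_2}) - R'|^+\big] \tag{45} $$

$$ \doteq \exp\Big[-n \min_{V \in \mathscr{V}_n(\mathcal{Y}_2; Q_{UX_2})} \big( D(V \| W_{Y_2|UX_2} | Q_{UX_2}) + |I(Q_{U|X_2}, V | Q_{X_2}) - R'|^+ \big)\Big]. \tag{46} $$

By appealing to continuity of exponents [6, Lem. 10.5] (minimum over conditional types is arbitrarily close to the
minimum over conditional distributions for n large enough), we obtain the exponent $F(R')$ in (12). ■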
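The passage from the clipped union bound in (34) to an exponent involving $|\cdot|^+$ rests on the elementary identity $\min\{1, \exp(-nt)\} = \exp(-n |t|^+)$, with $\exp(t) = 2^t$ as in Section II. The short Python check below, with hypothetical function names and arbitrary test values, is included only to make this step explicit.

```python
def clipped_union_bound(n, info, rate):
    """min{1, exp(-n (info - rate))} with exp(t) meaning 2**t."""
    return min(1.0, 2.0 ** (-n * (info - rate)))


def exponent_form(n, info, rate):
    """exp(-n |info - rate|^+) with exp(t) meaning 2**t."""
    return 2.0 ** (-n * max(info - rate, 0.0))


for info, rate in [(0.5, 0.2), (0.2, 0.5), (0.3, 0.3)]:
    print(info, rate, clipped_union_bound(20, info, rate), exponent_form(20, info, rate))
```

When the empirical information falls below the rate, the union bound is vacuous and the probability is simply bounded by one, which is what the clipping records.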



IV. Compress-Forward (CF) 

In this section, we state and prove an achievable error exponent for CF. CF is more complicated than PDF because
the relay does vector quantization on the channel outputs $Y_2^n$ and forwards the description to the destination. This
quantized version of the channel output is denoted by $\hat{Y}_2^n \in \hat{\mathcal{Y}}_2^n$ and the error here is analyzed using covering
techniques Marton introduced for deriving the error exponent for rate-distortion [22]. Subsequently, the receiver
decodes both the bin index and the message. This combination of covering and packing leads to a more involved
analysis of the error exponent that needs to leverage ideas in Kelly-Wagner [11] where the error exponent for
Wyner-Ziv was derived. It also leverages a recently developed proof technique by Scarlett-Guillén i Fàbregas [12]
to analyze the error when two indices are to be simultaneously decoded given a channel output. At a high level, we
operate on a conditional type-by-conditional type basis for the covering step at the relay. We also use an $\alpha$-decoding
rule [21] for decoding the messages and the bin indices at the receiver.

This section is structured as follows: In Section IV-A, we provide basic definitions of the quantities that are used
to state and prove the main theorem. The main theorem is stated in Section IV-B. A detailed set of remarks to 
help in understanding the quantities involved in the main theorem is provided in Section IV-C. Finally, the proof 
is provided in Section IV-D. The notation for the codewords follows that in El Gamal and Kim [3, Thm. 16.4]. 

A. Basic Definitions 

Before we are able to state our result as succinctly as possible, we find it convenient to first define several quantities
upfront. For CF, the following types and conditional types will be kept fixed and hence can be optimized over
eventually: input distributions $Q_{X_1} \in \mathscr{P}_n(\mathcal{X}_1)$, $Q_{X_2} \in \mathscr{P}_n(\mathcal{X}_2)$ and test channel $Q_{\hat{Y}_2|Y_2X_2} \in \mathscr{V}_n(\hat{\mathcal{Y}}_2; Q_{Y_2|X_2} Q_{X_2})$
for some (adversarial) channel realization $Q_{Y_2|X_2} \in \mathscr{V}_n(\mathcal{Y}_2; Q_{X_2})$ to be specified later.

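1) Auxiliary Channels: Let the auxiliary channel $W_{Q_{X_1}} : \mathcal{X}_2 \to \mathcal{Y}_2 \times \mathcal{Y}_3$ be defined as

$$ W_{Q_{X_1}}(y_2, y_3 | x_2) := \sum_{x_1} W(y_2, y_3 | x_1, x_2) Q_{X_1}(x_1). \tag{47} $$

This is simply the original relay channel averaged over $Q_{X_1}$. With a slight abuse of notation, we denote its marginals
using the same notation, i.e.,

$$ W_{Q_{X_1}}(y_3 | x_2) := \sum_{y_2} W_{Q_{X_1}}(y_2, y_3 | x_2), \tag{48} $$

$$ W_{Q_{X_1}}(y_2 | x_2) := \sum_{y_3} W_{Q_{X_1}}(y_2, y_3 | x_2). \tag{49} $$

Define another auxiliary channel $W_{Q_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2X_2}} : \mathcal{X}_1 \times \mathcal{X}_2 \to \hat{\mathcal{Y}}_2 \times \mathcal{Y}_3$ as

$$ W_{Q_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2X_2}}(\hat{y}_2, y_3 | x_1, x_2) := \sum_{y_2} W(y_3 | x_1, x_2, y_2)\, Q_{\hat{Y}_2|Y_2X_2}(\hat{y}_2 | y_2, x_2)\, Q_{Y_2|X_2}(y_2 | x_2). \tag{50} $$

This is simply the original relay channel averaged over both the channel realization $Q_{Y_2|X_2}$ and the test channel $Q_{\hat{Y}_2|Y_2X_2}$.
Hence, if the realized conditional type of the relay input is $Q_{Y_2|X_2}$ and we fixed the test channel to be $Q_{\hat{Y}_2|Y_2X_2}$
(to be chosen dependent on $Q_{Y_2|X_2}$), then we show that, effectively, the channel from $\mathcal{X}_1 \times \mathcal{X}_2$ to $\hat{\mathcal{Y}}_2 \times \mathcal{Y}_3$ behaves
as $W_{Q_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2X_2}}$. We make this precise in the proofs. See the steps leading to (115).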
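The averaging in (47) and the marginalizations in (48) and (49) amount to simple sums over a channel tensor. The Python sketch below illustrates this on a hypothetical toy relay channel, in which $Y_2$ is $X_1$ observed through a binary symmetric channel and $Y_3$ is $X_1 \oplus X_2$ observed through another one; the toy channel, the indexing convention `W[x1][x2][y2][y3]` and the function names are our assumptions and are not taken from the paper.

```python
def average_over_x1(W, Q_x1):
    """Averaged channel of (47): sum over x1 of W(y2, y3 | x1, x2) Q_x1(x1), indexed [x2][y2][y3]."""
    n_x2, n_y2, n_y3 = len(W[0]), len(W[0][0]), len(W[0][0][0])
    return [[[sum(Q_x1[a] * W[a][x2][y2][y3] for a in range(len(Q_x1)))
              for y3 in range(n_y3)]
             for y2 in range(n_y2)]
            for x2 in range(n_x2)]


def marginal_y3(Wq):
    """Marginal of (48): sum over y2, indexed [x2][y3]."""
    return [[sum(row[y3] for row in Wq_x2) for y3 in range(len(Wq_x2[0]))] for Wq_x2 in Wq]


def marginal_y2(Wq):
    """Marginal of (49): sum over y3, indexed [x2][y2]."""
    return [[sum(row) for row in Wq_x2] for Wq_x2 in Wq]


def bsc(p):
    """Transition matrix of a binary symmetric channel with crossover probability p."""
    return [[1 - p, p], [p, 1 - p]]


A, B = bsc(0.1), bsc(0.2)  # hypothetical sender-relay and sender/relay-receiver noise
W = [[[[A[x1][y2] * B[x1 ^ x2][y3] for y3 in range(2)] for y2 in range(2)]
      for x2 in range(2)] for x1 in range(2)]
Wq = average_over_x1(W, [0.5, 0.5])
print(marginal_y2(Wq))
print(marginal_y3(Wq))
```

With a uniform $Q_{X_1}$, both marginals of this toy channel are uniform for every $x_2$, as one would expect by symmetry.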

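2) Other Channels and Distributions: For any two channels $Q_{Y_2|X_2}, \tilde{Q}_{Y_2|X_2} : \mathcal{X}_2 \to \mathcal{Y}_2$, define two $\hat{\mathcal{Y}}_2$-modified
channels as follows:

$$ Q_{\hat{Y}_2|X_2}(\hat{y}_2 | x_2) := \sum_{y_2} Q_{\hat{Y}_2|Y_2,X_2}(\hat{y}_2 | y_2, x_2)\, Q_{Y_2|X_2}(y_2 | x_2), \tag{51} $$

$$ \tilde{Q}_{\hat{Y}_2|X_2}(\hat{y}_2 | x_2) := \sum_{y_2} Q_{\hat{Y}_2|Y_2,X_2}(\hat{y}_2 | y_2, x_2)\, \tilde{Q}_{Y_2|X_2}(y_2 | x_2). \tag{52} $$

Implicit in these definitions are $Q_{\hat{Y}_2|Y_2X_2}$, $Q_{Y_2|X_2}$ and $\tilde{Q}_{Y_2|X_2}$ but these dependencies are suppressed for the sake
of brevity. For any $V : \mathcal{X}_1 \times \mathcal{X}_2 \times \hat{\mathcal{Y}}_2 \to \mathcal{Y}_3$, let the induced conditional distributions $V_{Q_{X_1}} : \mathcal{X}_2 \times \hat{\mathcal{Y}}_2 \to \mathcal{Y}_3$ and
$\tilde{Q}_{\hat{Y}_2|X_2} \times V : \mathcal{X}_1 \times \mathcal{X}_2 \to \hat{\mathcal{Y}}_2 \times \mathcal{Y}_3$ be defined as

$$ V_{Q_{X_1}}(y_3 | x_2, \hat{y}_2) := \sum_{x_1} V(y_3 | x_1, x_2, \hat{y}_2)\, Q_{X_1}(x_1), \tag{53} $$

$$ (\tilde{Q}_{\hat{Y}_2|X_2} \times V)(\hat{y}_2, y_3 | x_1, x_2) := V(y_3 | x_1, x_2, \hat{y}_2)\, \tilde{Q}_{\hat{Y}_2|X_2}(\hat{y}_2 | x_2). \tag{54} $$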

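3) Sets of Distributions and $\alpha$-Decoder: Define the set of joint types $P_{X_1X_2\hat{Y}_2Y_3}$ with marginals consistent with
$Q_{X_1}$, $Q_{X_2}$ and $Q_{\hat{Y}_2|X_2}$ as

$$ \mathscr{P}_n(Q_{X_1}, Q_{X_2}, Q_{\hat{Y}_2|X_2}) := \{P_{X_1X_2\hat{Y}_2Y_3} \in \mathscr{P}_n(\mathcal{X}_1 \times \mathcal{X}_2 \times \hat{\mathcal{Y}}_2 \times \mathcal{Y}_3) : (P_{X_1}, P_{X_2}, P_{\hat{Y}_2|X_2}) = (Q_{X_1}, Q_{X_2}, Q_{\hat{Y}_2|X_2})\}. \tag{55} $$

We will use the notation $\mathscr{P}(Q_{X_1}, Q_{X_2}, Q_{\hat{Y}_2|X_2})$ (without subscript $n$) to mean the same set as in (55) without the
restriction to types but all distributions in $\mathscr{P}(\mathcal{X}_1 \times \mathcal{X}_2 \times \hat{\mathcal{Y}}_2 \times \mathcal{Y}_3)$ satisfying the constraints in (55). For any four
sequences $(x_1^n, x_2^n, \hat{y}_2^n, y_3^n)$, define the function $\alpha$ as

$$ \alpha(x_1^n, \hat{y}_2^n, y_3^n | x_2^n) = \alpha(P, V) := D(V \| W_{Q_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2X_2}} | P) + H(V | P), \tag{56} $$

where $P$ is the joint type of $(x_1^n, x_2^n, \hat{y}_2^n)$, $V : \mathcal{X}_1 \times \mathcal{X}_2 \times \hat{\mathcal{Y}}_2 \to \mathcal{Y}_3$ is the conditional type of $y_3^n$ given $(x_1^n, x_2^n, \hat{y}_2^n)$,
and $W_{Q_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2X_2}}$ is the channel defined in (50). Roughly speaking, to decode the bin index and message, we
will maximize $\alpha$ over bin indices, messages and conditional types $\tilde{Q}_{Y_2|X_2}$. This is analogous to maximum-likelihood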
decoding [21]. Define the set of conditional types

$$ \mathscr{K}_n(Q_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2X_2}) := \{V \in \mathscr{V}_n(\mathcal{Y}_3; Q_{X_1} Q_{X_2} Q_{\hat{Y}_2|X_2}) : \alpha(Q_{X_1} Q_{X_2} Q_{\hat{Y}_2|X_2}, V) \ge \alpha(Q_{X_1} Q_{X_2} Q_{\hat{Y}_2|X_2}, W_{Q_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2X_2}})\}. \tag{57} $$

Note that $Q_{\hat{Y}_2|X_2}$ is given in (51) and is induced by the two arguments of $\mathscr{K}_n$. Intuitively, the conditional types
contained in $\mathscr{K}_n(Q_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2X_2})$ are those corresponding to sequences $y_3^n$ that lead to an error as the likelihood
computed with respect to $V$ is larger than (or equal to) that for the true averaged channel $W_{Q_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2X_2}}$. The
marginal types $Q_{X_1}$, $Q_{X_2}$ are fixed in the notation $\mathscr{K}_n$ but we omit them for brevity. We will use the notation
$\mathscr{K}(Q_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2X_2})$ (without subscript $n$) to mean the same set as in (57) without the restriction to conditional
types but all conditional distributions from $\mathcal{X}_1 \times \mathcal{X}_2 \times \hat{\mathcal{Y}}_2$ to $\mathcal{Y}_3$ satisfying the constraints in (57).

B. Error Exponent for Compress-Forward 

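Theorem 5 (Error Exponent for Compress-Forward). Fix $b \in \mathbb{N}$ and "Wyner-Ziv rate" $R_2 \ge 0$, distributions
$Q_{X_1} \in \mathscr{P}(\mathcal{X}_1)$ and $Q_{X_2} \in \mathscr{P}(\mathcal{X}_2)$ and auxiliary alphabet $\hat{\mathcal{Y}}_2$. We have

$$ E_b(R_{\mathrm{eff}}) \ge \frac{1}{b} \min\{G_1(R, R_2), G_2(R, R_2)\} \tag{58} $$

where the constituent exponents are defined as

$$ G_1(R, R_2) := \min_{V : \mathcal{X}_2 \to \mathcal{Y}_3} D(V \| W_{Q_{X_1}} | Q_{X_2}) + |I(Q_{X_2}, V) - R_2|^+ \tag{59} $$

where $W_{Q_{X_1}}$ is defined in (48) and

$$ G_2(R, R_2) := \min_{Q_{Y_2|X_2} : \mathcal{X}_2 \to \mathcal{Y}_2} D(Q_{Y_2|X_2} \| W_{Q_{X_1}} | Q_{X_2}) + \max_{Q_{\hat{Y}_2|Y_2X_2} : \mathcal{Y}_2 \times \mathcal{X}_2 \to \hat{\mathcal{Y}}_2} J(R, R_2, Q_{\hat{Y}_2|Y_2X_2}, Q_{Y_2|X_2}). \tag{60} $$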

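The quantity $J(R, R_2, Q_{\hat{Y}_2|Y_2X_2}, Q_{Y_2|X_2})$ that constitutes $G_2(R, R_2)$ is defined as

$$ J(R, R_2, Q_{\hat{Y}_2|Y_2X_2}, Q_{Y_2|X_2}) := \min_{P_{X_1X_2\hat{Y}_2Y_3} \in \mathscr{P}(Q_{X_1}, Q_{X_2}, Q_{\hat{Y}_2|X_2})} \Big[ D(P_{\hat{Y}_2Y_3|X_1X_2} \| W_{Q_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2X_2}} | Q_{X_1} Q_{X_2}) + \min_{l = 1, 2}\ \min_{\tilde{Q}_{Y_2|X_2} : \mathcal{X}_2 \to \mathcal{Y}_2}\ \min_{V \in \mathscr{K}(\tilde{Q}_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2X_2})} \psi_l(V, \tilde{Q}_{Y_2|X_2}, R, R_2, P_{X_1X_2\hat{Y}_2Y_3}) \Big] \tag{61} $$

where $W_{Q_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2X_2}}$, $Q_{\hat{Y}_2|X_2}$ and $\tilde{Q}_{\hat{Y}_2|X_2}$ are defined in (50), (51) and (52) respectively and the functions
$\psi_l$, $l = 1, 2$, are defined as

$$ \psi_1(V, \tilde{Q}_{Y_2|X_2}, R, R_2, P_{X_1X_2\hat{Y}_2Y_3}) := |I(Q_{X_1}, \tilde{Q}_{\hat{Y}_2|X_2} \times V | Q_{X_2}) - R|^+ \tag{62} $$

and

$$ \psi_2(V, \tilde{Q}_{Y_2|X_2}, R, R_2, P_{X_1X_2\hat{Y}_2Y_3}) := \begin{cases} \big| I(\tilde{Q}_{\hat{Y}_2|X_2}, V_{Q_{X_1}} | Q_{X_2}) + |I(Q_{X_1}, \tilde{Q}_{\hat{Y}_2|X_2} \times V | Q_{X_2}) - R|^+ + R_2 - I(\tilde{Q}_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2,X_2} | Q_{X_2}) \big|^+ & \text{if } R_2 \le I(\tilde{Q}_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2,X_2} | Q_{X_2}) \\ I(\tilde{Q}_{\hat{Y}_2|X_2}, V_{Q_{X_1}} | Q_{X_2}) + |I(Q_{X_1}, \tilde{Q}_{\hat{Y}_2|X_2} \times V | Q_{X_2}) - R|^+ & \text{if } R_2 > I(\tilde{Q}_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2,X_2} | Q_{X_2}) \end{cases} \tag{63} $$

where $V_{Q_{X_1}}$ and $\tilde{Q}_{\hat{Y}_2|X_2} \times V$ are defined in (53) and (54) respectively.
Note that (63) can be written more succinctly as

$$ \psi_2(V, \tilde{Q}_{Y_2|X_2}, R, R_2, P_{X_1X_2\hat{Y}_2Y_3}) = \Big| I(\tilde{Q}_{\hat{Y}_2|X_2}, V_{Q_{X_1}} | Q_{X_2}) + |I(Q_{X_1}, \tilde{Q}_{\hat{Y}_2|X_2} \times V | Q_{X_2}) - R|^+ - |I(\tilde{Q}_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2,X_2} | Q_{X_2}) - R_2|^+ \Big|^+. \tag{64} $$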
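The claim that the two-case definition in (63) coincides with the single-line form in (64) can be sanity-checked numerically. In the Python sketch below, which reflects our reading of these two displays, `a` stands for the first mutual information, `b` for the second term after clipping by $|\cdot|^+$ (so `b` is nonnegative), and `rate_loss` for the final mutual information term; these names and the random test values are hypothetical.

```python
import random


def pos(t):
    """|t|^+ = max{t, 0}."""
    return max(t, 0.0)


def psi2_cases(a, b, rate_loss, r2):
    """Two-case form: binning regime if r2 <= rate_loss, otherwise no additional binning."""
    if r2 <= rate_loss:
        return pos(a + b + r2 - rate_loss)
    return a + b


def psi2_compact(a, b, rate_loss, r2):
    """Single-line form with the inner clipping |rate_loss - r2|^+."""
    return pos(a + b - pos(rate_loss - r2))


random.seed(0)
for _ in range(5):
    a, b, loss, r2 = (random.uniform(0.0, 1.0) for _ in range(4))
    print(round(psi2_cases(a, b, loss, r2), 6), round(psi2_compact(a, b, loss, r2), 6))
```

The agreement relies on the first two terms being nonnegative, so that the outer clipping is inactive whenever $R_2$ exceeds the rate-loss term.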



C. Remarks on the Error Exponent for Compress-Forward 

In this section, we dissect the main features of the CF error exponent presented in Theorem 5.

1) We are free to choose the independent input distributions $Q_{X_1}$ and $Q_{X_2}$, though these will be $n$-types for
finite $n \in \mathbb{N}$. We also have the freedom to choose any "Wyner-Ziv rate" $R_2 \ge 0$. Thus, we can optimize over
$Q_{X_1}$, $Q_{X_2}$ and $R_2$. The $X_1$- and $X_2$-codewords are uniformly distributed in $\mathcal{T}_{Q_{X_1}}$ and $\mathcal{T}_{Q_{X_2}}$ respectively.

2) As is well known in CF [2], the relay transmits a description $\hat{y}_2^n(j)$ of the received sequence $y_2^n(j)$ (conditioned
on $x_2^n(j)$ which is known to both relay and decoder) via a covering step. This explains the final mutual
information term in (63) which can be written as the rate loss $I(Y_2; \hat{Y}_2 | X_2)$, where $(X_2, Y_2, \hat{Y}_2)$ is distributed
as $Q_{X_2} \times \tilde{Q}_{Y_2|X_2} \times Q_{\hat{Y}_2|Y_2X_2}$. Since covering results in super-exponential decay in the error probability, this
does not affect the overall exponent since the smallest one dominates. See the steps leading to (82) in the
proof.

3) The exponent $G_1(R, R_2)$ in (59) is analogous to $G(R')$ in (13). This represents the error rate in the estimation
of $X_2$'s index given $Y_3^n$ using MMI decoding. However, in the CF proof, we do not use the packing lemma.
Rather we construct a random code and show that in expectation (over the random code), the error probability
decays exponentially fast with the exponent given by $G_1(R, R_2)$.

4) In the exponent $G_2(R, R_2)$ in (60), $Q_{Y_2|X_2}$ is the realization of the conditional type of the received signal
at the relay $y_2^n(j)$ given $x_2^n(j)$. The divergence term $D(Q_{Y_2|X_2} \| W_{Q_{X_1}} | Q_{X_2})$ represents the deviation from
the true channel behavior $W_{Q_{X_1}}$. We can optimize for the conditional distribution (test channel) $Q_{\hat{Y}_2|Y_2X_2}$
compatible with $Q_{Y_2|X_2} Q_{X_2}$. This explains the inner maximization over $Q_{\hat{Y}_2|Y_2X_2}$ and outer minimization
over $Q_{Y_2|X_2}$ in (60). This is a game-theoretic-type result along the same lines as in [11], [15].

5) The first part of $J$ given by $\psi_1$ in (62) represents the incorrect decoding of the index of $X_1^n$ (message $M_j$) as
well as the conditional type $\tilde{Q}_{Y_2|X_2}$ given that the bin index of the description $\hat{Y}_2^n$ is decoded correctly. The
second part of $J$ given by $\psi_2$ in (63) represents the incorrect decoding of the bin index of $\hat{Y}_2^n$, the index of $X_1^n$
(message $M_j$) as well as the conditional type $\tilde{Q}_{Y_2|X_2}$. We see the different sources of "errors" in (61): There is
a minimization over the different types of channel behavior represented by $P_{X_1X_2\hat{Y}_2Y_3}$ and also a minimization
over estimated conditional types $\tilde{Q}_{Y_2|X_2}$. Subsequently, the error involved in $\alpha$-decoding of the message and
the bin index of the description sequence is represented by the minimization over $V \in \mathscr{K}(\tilde{Q}_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2X_2})$.

6) We see that the freedom of choice of the "Wyner-Ziv rate" $R_2 \ge 0$ allows us to operate in one of two distinct regimes. This can be seen from the two different cases in (63). The number of description sequences is designed to be $\exp(n I(Q_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2 X_2} | Q_{X_2}))$ to first order in the exponent, where the choice of $Q_{\hat{Y}_2|Y_2 X_2}$ depends on the realized conditional type $Q_{Y_2|X_2}$. Thus, when $R_2 \le I(Q_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2 X_2} | Q_{X_2})$, we do additional Wyner-Ziv binning as there are more description sequences than bins. If $R_2$ is larger than $I(Q_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2 X_2} | Q_{X_2})$, no additional binning is required.

7) For the analysis of the bin and message indices, if we simply apply the packing lemmas in [6], [9], [10], [21], this would result in a suboptimal rate vis-a-vis CF. This is because the conditional correlation of $X_1$ and $\hat{Y}_2$ given $X_2$ would not be taken into account. Thus, we need to analyze this error exponent more carefully using techniques introduced in [12] for the multiple-access channel. Note that the first two mutual informations (ignoring the $|\cdot|^+$) in (63) can be written as

$$I(\hat{Y}_2; Y_3 | X_2) + I(X_1; \hat{Y}_2, Y_3 | X_2) = H(X_1|X_2) + H(\hat{Y}_2|X_2) + H(Y_3|X_2) - H(X_1, \hat{Y}_2, Y_3 | X_2), \tag{65}$$

where $(X_1, X_2, \hat{Y}_2, Y_3) \sim Q_{X_1} \times Q_{X_2} \times Q_{\hat{Y}_2|X_2} \times V$. The entropies in (65) demonstrate the symmetry between $X_1$ and $\hat{Y}_2$ when they are decoded jointly at the receiver $Y_3$. In fact, the proof shows that by modifying the order of applying union bounds, we can get another achievable exponent [12].

8) From the exponents in Theorem 5, it is clear upon eliminating $R_2$ (if $R_2$ is chosen small enough so that Wyner-Ziv binning is necessary) that we recover the CF lower bound in (2). Indeed, if $\psi_1$ is active in the minimization in (61), the first term in (2) is positive if and only if the error exponent $G_2$ is positive for some choice of distributions $Q_{X_1}$, $Q_{X_2}$ and $Q_{\hat{Y}_2|Y_2 X_2}$. Also, if $\psi_2$ is active in the minimization in (61) and $R_2$ is chosen sufficiently small (so that the first clause of (63) is active), $G_2$ is positive if

$$R < R_2 + I(X_1; \hat{Y}_2, Y_3 | X_2) + I(\hat{Y}_2; Y_3 | X_2) - I(\hat{Y}_2; Y_2 | X_2) \tag{66}$$

$$< I(X_1; \hat{Y}_2, Y_3 | X_2) + I(Y_3; X_2, \hat{Y}_2) - I(\hat{Y}_2; Y_2 | X_2) \tag{67}$$

$$= I(X_1, X_2; Y_3) - I(\hat{Y}_2; Y_2 | X_1, X_2, Y_3), \tag{68}$$

for some $Q_{X_1}$, $Q_{X_2}$ and $Q_{\hat{Y}_2|Y_2 X_2}$. In (67), we used the fact that $G_1$ in (59) is positive if and only if $R_2 < I(X_2; Y_3)$, and in (68) we also used the Markov chain $\hat{Y}_2 - (X_2, Y_2) - (X_1, Y_3)$ [3, p. 402]. Equation (68) matches the second term in (2). If we had used the standard packing lemmas in [6], [9], [10], [21], we would not observe this positivity when the rate is below the CF lower bound.

9) Lastly, the evaluation of the CF exponent may be tricky because of the multiple nested optimizations. It is not apparent how to simplify the CF exponent to make it amenable to evaluation for a given DM-RC $W$.
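The identities in (65) and (66)-(68) can be checked numerically. The following is a minimal sketch, assuming binary alphabets and randomly drawn distributions (both are illustrative choices, not taken from the paper); the helper names `H`, `cond_H` and `cond_mi` are hypothetical. The joint distribution is built so that $\hat{Y}_2$ depends only on $(X_2, Y_2)$, which is the Markov chain used in (68).

```python
import numpy as np

def entropy(p):
    """Shannon entropy (nats) of a pmf given as an array of any shape."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def H(joint, axes):
    """Joint entropy of the variables on `axes` under `joint`."""
    drop = tuple(a for a in range(joint.ndim) if a not in axes)
    return entropy(joint.sum(axis=drop)) if drop else entropy(joint)

def cond_H(joint, a, c=()):
    """H(A|C) = H(A,C) - H(C)."""
    return H(joint, tuple(a) + tuple(c)) - (H(joint, tuple(c)) if c else 0.0)

def cond_mi(joint, a, b, c=()):
    """I(A;B|C) = H(A|C) - H(A|B,C)."""
    return cond_H(joint, a, c) - cond_H(joint, a, tuple(b) + tuple(c))

rng = np.random.default_rng(0)

def rand_cond(shape):
    """Random conditional pmf, normalised over the last axis."""
    t = rng.random(shape)
    return t / t.sum(axis=-1, keepdims=True)

q1 = rand_cond((2,))                          # Q_{X1}
q2 = rand_cond((2,))                          # Q_{X2}
w = rand_cond((2, 2, 4)).reshape(2, 2, 2, 2)  # W(y2,y3|x1,x2), indexed [x1,x2,y2,y3]
q_hat = rand_cond((2, 2, 2))                  # test channel, indexed [x2,y2,yhat2]

# Axes of `joint`: (X1, X2, Y2, Yhat2, Y3)
X1, X2, Y2, YH, Y3 = range(5)
joint = np.einsum('a,b,abcd,bce->abced', q1, q2, w, q_hat)

# Identity (65)
lhs65 = cond_mi(joint, (YH,), (Y3,), (X2,)) + cond_mi(joint, (X1,), (YH, Y3), (X2,))
rhs65 = (cond_H(joint, (X1,), (X2,)) + cond_H(joint, (YH,), (X2,))
         + cond_H(joint, (Y3,), (X2,)) - cond_H(joint, (X1, YH, Y3), (X2,)))

# Right-hand side of (67) versus (68)
rhs67 = (cond_mi(joint, (X1,), (YH, Y3), (X2,)) + cond_mi(joint, (Y3,), (X2, YH))
         - cond_mi(joint, (YH,), (Y2,), (X2,)))
rhs68 = cond_mi(joint, (X1, X2), (Y3,)) - cond_mi(joint, (YH,), (Y2,), (X1, X2, Y3))
```

Both pairs agree up to floating-point error for any seed, since (65) is a chain-rule identity and the step from (67) to (68) only uses the Markov structure of the test channel.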

D. Proof of Theorem 5 

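Proof: Random Codebook Generation: We again use block-Markov coding [2]. Fix types $Q_{X_1} \in \mathscr{P}_n(\mathcal{X}_1)$ and $Q_{X_2} \in \mathscr{P}_n(\mathcal{X}_2)$ as well as rates $R, R_2 \ge 0$. For each $j \in [b]$, generate a random codebook in the following manner. Randomly and independently generate $\exp(nR)$ codewords $x_1^n(m_j) \sim \mathrm{Unif}[\mathcal{T}_{Q_{X_1}}]$, where $\mathrm{Unif}[\mathcal{A}]$ is the uniform distribution over the finite set $\mathcal{A}$. Randomly and independently generate $\exp(nR_2)$ codewords $x_2^n(l_{j-1}) \sim \mathrm{Unif}[\mathcal{T}_{Q_{X_2}}]$. Now for every $Q_{Y_2|X_2} \in \mathscr{V}_n(\mathcal{Y}_2; Q_{X_2})$ fix a different test channel $Q_{\hat{Y}_2|Y_2,X_2}(Q_{Y_2|X_2}) \in \mathscr{V}_n(\hat{\mathcal{Y}}_2; Q_{Y_2|X_2} Q_{X_2})$. For every $Q_{Y_2|X_2} \in \mathscr{V}_n(\mathcal{Y}_2; Q_{X_2})$ and every $x_2^n(l_{j-1})$, construct a conditional type-dependent codebook $\mathcal{B}(Q_{Y_2|X_2}, l_{j-1}) \subset \hat{\mathcal{Y}}_2^n$ of integer size $|\mathcal{B}(Q_{Y_2|X_2}, l_{j-1})|$ whose rate, which we call the inflated rate, satisfies

$$\tilde{R}_2(Q_{Y_2|X_2}) := \frac{1}{n} \log |\mathcal{B}(Q_{Y_2|X_2}, l_{j-1})| = I(Q_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2,X_2}(Q_{Y_2|X_2}) | Q_{X_2}) + \nu_n, \tag{69}$$

where $\nu_n \in \Theta(\frac{\log n}{n})$ and more precisely,

$$\frac{(|\mathcal{X}_2||\mathcal{Y}_2||\hat{\mathcal{Y}}_2| + 2) \log(n+1)}{n} \le \nu_n \le \frac{(|\mathcal{X}_2||\mathcal{Y}_2||\hat{\mathcal{Y}}_2| + 3) \log(n+1)}{n}. \tag{70}$$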

Each sequence in $\mathcal{B}(Q_{Y_2|X_2}, l_{j-1})$ is indexed as $\hat{y}_2^n(k_j | l_{j-1})$ and is drawn independently according to the uniform distribution $\mathrm{Unif}[\mathcal{T}_{Q_{\hat{Y}_2|X_2}}(x_2^n(l_{j-1}))]$, where $Q_{\hat{Y}_2|X_2}$ is the marginal induced by $Q_{Y_2|X_2}$ and $Q_{\hat{Y}_2|Y_2,X_2}(Q_{Y_2|X_2})$. See the definition in (51). Depending on the choice of $R_2$, do one of the following:
• If $R_2 \le \tilde{R}_2(Q_{Y_2|X_2})$, partition the conditional type-dependent codebook $\mathcal{B}(Q_{Y_2|X_2}, l_{j-1})$ into $\exp(nR_2)$ equal-sized bins $\mathcal{B}_{l_j}(Q_{Y_2|X_2}, l_{j-1})$, $l_j \in [\exp(nR_2)]$.

• If $R_2 > \tilde{R}_2(Q_{Y_2|X_2})$, assign each element of $\mathcal{B}(Q_{Y_2|X_2}, l_{j-1})$ a unique index in $[\exp(nR_2)]$.
Transmitter Encoding: The encoder transmits $x_1^n(m_j)$ at block $j \in [b]$.
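The codebook-size rule in (69)-(70) and the binning decision can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, rates are in nats, and rounding the size up from the lower end of the window in (70) is an assumed way of obtaining an integer codebook size, not necessarily the construction intended in the proof.

```python
import math

def nu_n_bounds(n, ax2, ay2, ayh):
    """Lower and upper bounds on nu_n from (70), in nats.

    ax2, ay2, ayh are the alphabet sizes |X2|, |Y2| and |Yhat2|.
    """
    k = ax2 * ay2 * ayh
    lo = (k + 2) * math.log(n + 1) / n
    hi = (k + 3) * math.log(n + 1) / n
    return lo, hi

def codebook_size(mi, n, ax2, ay2, ayh):
    """Integer codebook size whose rate mi + nu_n satisfies (69)-(70).

    mi is the conditional mutual information in (69). Rounding up from the
    lower end of the window is an assumption made for illustration.
    """
    lo, _ = nu_n_bounds(n, ax2, ay2, ayh)
    size = math.ceil(math.exp(n * (mi + lo)))
    nu = math.log(size) / n - mi
    return size, nu

def needs_binning(r2, inflated_rate):
    """True in the first regime (R2 <= inflated rate), where bins are formed."""
    return r2 <= inflated_rate

# Example with binary alphabets and blocklength n = 20 (illustrative values).
lo, hi = nu_n_bounds(20, 2, 2, 2)
size, nu = codebook_size(0.1, 20, 2, 2, 2)
```

Rounding up changes the size by at most a factor of two, which costs at most $\log(n+1)/n$ in rate, so the resulting $\nu_n$ stays inside the window of (70).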

Relay Encoding: At the end of block $j \in [b]$, the relay encoder has $x_2^n(l_{j-1})$ (by convention $l_0 := 1$) and its input sequence $y_2^n(j)$. It computes the conditional type $Q_{Y_2|X_2} \in \mathscr{V}_n(\mathcal{Y}_2; Q_{X_2})$. Then it searches in $\mathcal{B}(Q_{Y_2|X_2}, l_{j-1})$ for a description sequence

$$\hat{y}_2^n(k_j | l_{j-1}) \in \mathcal{T}_{Q_{\hat{Y}_2|Y_2,X_2}(Q_{Y_2|X_2})}(y_2^n(j), x_2^n(l_{j-1})). \tag{71}$$

If more than one such sequence exists, choose one uniformly at random in $\mathcal{B}(Q_{Y_2|X_2}, l_{j-1})$ from those satisfying (71). If none exists, choose uniformly at random from $\mathcal{B}(Q_{Y_2|X_2}, l_{j-1})$. Identify the bin index $l_j$ of $\hat{y}_2^n(k_j | l_{j-1})$ and send $x_2^n(l_j)$.
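The relay's two operations, computing a conditional type and searching the codebook for a sequence satisfying (71), can be sketched on toy sequences. The function names and the dictionary representation of a conditional type are assumptions made for illustration.

```python
import random
from collections import Counter
from fractions import Fraction

def conditional_type(y, x):
    """Conditional type of y given x as a dict (a, b) -> N(a, b) / N(a)."""
    n_x = Counter(x)
    n_xy = Counter(zip(x, y))
    return {(a, b): Fraction(c, n_x[a]) for (a, b), c in n_xy.items()}

def in_shell(yh, y2, x2, target):
    """True if yh lies in the conditional type class of `target` given (y2, x2).

    `target` maps ((x2 symbol, y2 symbol), yhat symbol) to a probability and
    lists only strictly positive entries.
    """
    return conditional_type(yh, list(zip(x2, y2))) == target

def relay_describe(codebook, y2, x2, target, rng):
    """Index of a description satisfying (71), or a uniformly random index."""
    hits = [k for k, yh in enumerate(codebook) if in_shell(yh, y2, x2, target)]
    return rng.choice(hits) if hits else rng.randrange(len(codebook))

# Toy example: the test channel copies y2, so only the codeword equal to y2 fits.
x2 = [0, 0, 1, 1]
y2 = [0, 1, 0, 1]
target = {((0, 0), 0): Fraction(1), ((0, 1), 1): Fraction(1),
          ((1, 0), 0): Fraction(1), ((1, 1), 1): Fraction(1)}
codebook = [[1, 1, 1, 1], [0, 1, 0, 1], [0, 0, 0, 0]]
k = relay_describe(codebook, y2, x2, target, random.Random(0))
```

Exact rational arithmetic is used so that membership in a type class is an equality test rather than a tolerance comparison.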

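Decoding: At the end of block $j+1$, the receiver has the channel output $y_3^n(j+1)$. It does MMI decoding [5] by finding $\hat{l}_j$ satisfying

$$\hat{l}_j := \mathop{\arg\max}_{l_j \in [\exp(nR_2)]} \hat{I}(x_2^n(l_j) \wedge y_3^n(j+1)). \tag{72}$$

Having identified $\hat{l}_{j-1}, \hat{l}_j$ from (72), find message $\hat{m}_j$, index $\hat{k}_j$ and conditional type $\hat{Q}_{Y_2|X_2} \in \mathscr{V}_n(\mathcal{Y}_2; Q_{X_2})$ satisfying

$$(\hat{m}_j, \hat{k}_j, \hat{Q}_{Y_2|X_2}) = \mathop{\arg\max}_{(m_j, k_j, Q_{Y_2|X_2}) \,:\, \hat{y}_2^n(k_j|\hat{l}_{j-1}) \in \mathcal{B}_{\hat{l}_j}(Q_{Y_2|X_2}, \hat{l}_{j-1})} \alpha\big(x_1^n(m_j), \hat{y}_2^n(k_j | \hat{l}_{j-1}), y_3^n(j) \,\big|\, x_2^n(\hat{l}_{j-1})\big), \tag{73}$$

where the function $\alpha$ was defined in (56). This is an $\alpha$-decoder [21] which finds the $(\hat{m}_j, \hat{k}_j, \hat{Q}_{Y_2|X_2})$ maximizing $\alpha$ subject to $\hat{y}_2^n(k_j | \hat{l}_{j-1}) \in \mathcal{B}_{\hat{l}_j}(\hat{Q}_{Y_2|X_2}, \hat{l}_{j-1})$, where $\hat{l}_{j-1}, \hat{l}_j$ were found in (72). Declare that $\hat{m}_j$ was sent.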
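The MMI rule in (72) only needs empirical joint counts, which the following sketch makes concrete. The toy codebook and output sequence are made up for illustration, and the function names are hypothetical.

```python
import math
from collections import Counter

def empirical_mi(x, y):
    """Empirical mutual information (nats) of two equal-length sequences."""
    n = len(x)
    cx, cy, cxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(c / n * math.log(c * n / (cx[a] * cy[b]))
               for (a, b), c in cxy.items())

def mmi_decode(codebook, y):
    """Index of the codeword with maximum empirical mutual information with y."""
    scores = [empirical_mi(cw, y) for cw in codebook]
    return max(range(len(codebook)), key=scores.__getitem__)

# Three constant-composition codewords; y is the bitwise complement of the second.
codebook = [[0, 0, 1, 1], [0, 1, 0, 1], [1, 1, 0, 0]]
y = [1, 0, 1, 0]
l_hat = mmi_decode(codebook, y)
```

The decoder picks the second codeword even though the channel inverted every symbol, which illustrates that MMI decoding does not depend on knowledge of the channel law.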

Analysis of Error Probability: We now analyze the error probability. Assume as usual that $M_j = 1$ for all $j \in [b-1]$ and let $L_{j-1}$, $L_j$ and $K_j$ be indices chosen by the relay in block $j$. First, note that as in (21),

$$\mathsf{P}(\hat{M} \ne M) \le (b-1)(\epsilon_R + 2\epsilon_{D,1} + \epsilon_{D,2}), \tag{74}$$

where $\epsilon_R$ is the probability that there is no description sequence $\hat{y}_2^n(k_j | l_{j-1})$ in the codebook $\mathcal{B}(Q_{Y_2|X_2}, l_{j-1})$ that satisfies (71) (covering error),

$$\epsilon_{D,1} := \mathsf{P}(\hat{L}_j \ne L_j) \tag{75}$$

is the probability of decoding the bin index $l_j$ incorrectly, and

$$\epsilon_{D,2} := \mathsf{P}(\hat{M}_j \ne 1 \mid L_j, L_{j-1} \text{ decoded correctly}) \tag{76}$$

is the probability of decoding the message incorrectly. See the proof of compress-forward in [3, Thm. 16.4] for details of the calculation in (74). Again, since $b$ is a constant, it does not affect the exponential dependence of the error probability in (74). We bound each error probability separately. Note that the error probability is an average over the random codebook generation so, by the usual random coding argument, as long as this average is small, there must exist at least one code with such a small error probability.

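Covering Error $\epsilon_R$: For $\epsilon_R$, we follow the proof idea in [11, Lem. 2]. Regardless of the realized conditional type $Q_{Y_2|X_2} \in \mathscr{V}_n(\mathcal{Y}_2; Q_{X_2})$,

$$\epsilon_R \le \mathsf{P}\big(\mathcal{F} \,\big|\, Y_2^n = y_2^n, X_2^n = x_2^n, y_2^n \in \mathcal{T}_{Q_{Y_2|X_2}}(x_2^n)\big), \tag{77}$$

where $\mathcal{F}$ is the event that every sequence $\hat{y}_2^n(k_j | l_{j-1}) \in \mathcal{B}(Q_{Y_2|X_2}, l_{j-1})$ does not satisfy (71). Now we use the mutual independence of the codewords in $\mathcal{B}(Q_{Y_2|X_2}, l_{j-1})$ and basic properties of types (Lemmas 1 and 2) to assert that

$$\epsilon_R \le \prod_{k_j} \mathsf{P}\Big(\hat{Y}_2^n(k_j | L_{j-1}) \notin \mathcal{T}_{Q_{\hat{Y}_2|Y_2,X_2}(Q_{Y_2|X_2})}(Y_2^n(j), X_2^n(L_{j-1}))\Big) \tag{78}$$

$$= \Big[1 - \mathsf{P}\Big(\hat{Y}_2^n(1 | L_{j-1}) \in \mathcal{T}_{Q_{\hat{Y}_2|Y_2,X_2}(Q_{Y_2|X_2})}(Y_2^n(j), X_2^n(L_{j-1}))\Big)\Big]^{|\mathcal{B}(Q_{Y_2|X_2}, l_{j-1})|} \tag{79}$$

$$\le \Big[1 - (n+1)^{-|\mathcal{X}_2||\mathcal{Y}_2||\hat{\mathcal{Y}}_2|} \exp\big(-n I(Q_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2,X_2}(Q_{Y_2|X_2}) | Q_{X_2})\big)\Big]^{|\mathcal{B}(Q_{Y_2|X_2}, l_{j-1})|} \tag{80}$$

$$\le \exp\Big[-(n+1)^{-|\mathcal{X}_2||\mathcal{Y}_2||\hat{\mathcal{Y}}_2|} \exp\big(-n I(Q_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2,X_2}(Q_{Y_2|X_2}) | Q_{X_2})\big) \, |\mathcal{B}(Q_{Y_2|X_2}, l_{j-1})|\Big] \tag{81}$$

$$\le \exp\big(-(n+1)^2\big), \tag{82}$$

where the product in (78) extends over all indices $k_j$ for which $\hat{y}_2^n(k_j | l_{j-1}) \in \mathcal{B}(Q_{Y_2|X_2}, l_{j-1})$ for some fixed realization of $l_{j-1}$ which we can condition on. Inequality (81) follows from the inequality $(1-x)^k \le e^{-kx}$, and (82) follows from the choice of $\tilde{R}_2(Q_{Y_2|X_2})$ and $\nu_n$ in (69) and (70) respectively. This derivation is similar to the covering lemma for source coding with a fidelity criterion by Marton [22]. The punchline is that $\epsilon_R$ decays super-exponentially (i.e., $\epsilon_R \doteq 0$ or the exponent is infinity) and thus it does not affect the overall error exponent since the smallest one dominates.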
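The last two steps of the covering bound can be checked numerically in the log domain. This is a sketch under the assumption that $\nu_n$ sits at the lower end of the window in (70); the function names are hypothetical.

```python
import math

def log_covering_bound(n, k, nu):
    """Logarithm of the bound in (81) after the mutual information cancels.

    With |B| = exp(n (I + nu)), the exponent in (81) equals
    -(n+1)^(-k) * exp(n * nu), where k = |X2| |Y2| |Yhat2|.
    """
    return -math.exp(n * nu - k * math.log(n + 1))

def normalized_exponent(n, k, nu):
    """-(1/n) log of the bound; this grows without limit as n increases."""
    return -log_covering_bound(n, k, nu) / n

k = 8  # binary alphabets: 2 * 2 * 2
bounds = {}
for n in (10, 100, 1000):
    nu = (k + 2) * math.log(n + 1) / n   # smallest nu_n allowed by (70)
    bounds[n] = log_covering_bound(n, k, nu)

# The elementary inequality behind (81): (1 - x)^m <= exp(-m x) for 0 <= x <= 1.
ineq_ok = all((1 - x) ** m <= math.exp(-m * x) + 1e-15
              for x in (0.0, 0.01, 0.3, 0.9, 1.0) for m in (1, 5, 50))
```

For every blocklength the logarithm of the bound is about $-(n+1)^2$, so the normalized exponent is roughly $n$ and eventually exceeds any fixed constant, which is what "super-exponential" means here.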

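First Packing Error $\epsilon_{D,1}$: We assume $L_j = 1$ here. The calculation here is very similar to that in Section III-C but we provide the details for completeness. We evaluate $\epsilon_{D,1}$ by partitioning the sample space into subsets where $X_2^n(1)$ takes on various values $x_2^n \in \mathcal{T}_{Q_{X_2}}$. Thus, we have

$$\epsilon_{D,1} \le \mathsf{P}\big(\exists\, \tilde{l}_j \ne 1 : \hat{I}(X_2^n(\tilde{l}_j) \wedge Y_3^n(j+1)) \ge \hat{I}(X_2^n(1) \wedge Y_3^n(j+1))\big) \tag{83}$$

$$= \sum_{x_2^n \in \mathcal{T}_{Q_{X_2}}} \frac{1}{|\mathcal{T}_{Q_{X_2}}|} \beta_n(x_2^n), \tag{84}$$

where

$$\beta_n(x_2^n) := \mathsf{P}\big(\exists\, \tilde{l}_j \ne 1 : \hat{I}(X_2^n(\tilde{l}_j) \wedge Y_3^n(j+1)) \ge \hat{I}(X_2^n(1) \wedge Y_3^n(j+1)) \,\big|\, X_2^n(1) = x_2^n\big). \tag{85}$$

It can easily be seen that $\beta_n(x_2^n)$ is independent of $x_2^n \in \mathcal{T}_{Q_{X_2}}$ so we abbreviate $\beta_n(x_2^n)$ as $\beta_n$. Because $X_1^n(1)$ is generated uniformly at random from $\mathcal{T}_{Q_{X_1}}$, we have

$$\beta_n = \sum_{x_1^n \in \mathcal{T}_{Q_{X_1}}} \frac{1}{|\mathcal{T}_{Q_{X_1}}|} \sum_{y_3^n} W^n(y_3^n | x_1^n, x_2^n)\, \mu_n(y_3^n), \tag{86}$$

where

$$\mu_n(y_3^n) := \mathsf{P}\big(\exists\, \tilde{l}_j \ne 1 : \hat{I}(X_2^n(\tilde{l}_j) \wedge Y_3^n(j+1)) \ge \hat{I}(X_2^n(1) \wedge Y_3^n(j+1)) \,\big|\, Y_3^n(j+1) = y_3^n, X_2^n(1) = x_2^n\big). \tag{87}$$

We continue to bound $\beta_n$ as follows:

$$\beta_n \le \sum_{x_1^n \in \mathcal{T}_{Q_{X_1}}} (n+1)^{|\mathcal{X}_1|} \exp(-n H(Q_{X_1})) \sum_{y_3^n} W^n(y_3^n | x_1^n, x_2^n)\, \mu_n(y_3^n) \tag{88}$$

$$= \sum_{x_1^n \in \mathcal{T}_{Q_{X_1}}} (n+1)^{|\mathcal{X}_1|} Q_{X_1}^n(x_1^n) \sum_{y_3^n} W^n(y_3^n | x_1^n, x_2^n)\, \mu_n(y_3^n) \tag{89}$$

$$\le (n+1)^{|\mathcal{X}_1|} \sum_{y_3^n} \mu_n(y_3^n) \sum_{x_1^n} Q_{X_1}^n(x_1^n) W^n(y_3^n | x_1^n, x_2^n) \tag{90}$$

$$= (n+1)^{|\mathcal{X}_1|} \sum_{y_3^n} W_{Q_{X_1}}^n(y_3^n | x_2^n)\, \mu_n(y_3^n), \tag{91}$$

where (88) follows from the lower bound on the size of a type class (Lemma 1), (89) follows from the fact that the $Q_{X_1}^n$-probability of a sequence $x_1^n$ of type $Q_{X_1}$ is exactly $\exp(-n H(Q_{X_1}))$ and (91) is an application of the definition of $W_{Q_{X_1}}$ in (48). It remains to bound $\mu_n(y_3^n)$ in (87). We do so by first applying the union bound

$$\mu_n(y_3^n) \le \min\{1, \exp(nR_2)\, \tau_n(y_3^n)\}, \tag{92}$$

where

$$\tau_n(y_3^n) := \mathsf{P}\big(\hat{I}(X_2^n(2) \wedge Y_3^n(j+1)) \ge \hat{I}(X_2^n(1) \wedge Y_3^n(j+1)) \,\big|\, Y_3^n(j+1) = y_3^n, X_2^n(1) = x_2^n\big). \tag{93}$$

We now use the notation $V : \mathcal{Y}_3 \to \mathcal{X}_2$ to denote a reverse channel. Also let $P_{y_3^n}$ be the type of $y_3^n$. Let $\mathscr{R}(Q_{X_2})$ be the class of reverse channels satisfying $\sum_{y_3} V(x_2 | y_3) P_{y_3^n}(y_3) = Q_{X_2}(x_2)$. Then, we have

$$\tau_n(y_3^n) = \sum_{\substack{V \in \mathscr{V}_n(\mathcal{X}_2; P_{y_3^n}) \cap \mathscr{R}(Q_{X_2}) : \\ \hat{I}(x_2^n \wedge y_3^n) \le I(P_{y_3^n}, V)}} \mathsf{P}\big(\bar{X}_2^n \in \mathcal{T}_V(y_3^n)\big), \tag{94}$$

where $\bar{X}_2^n$ is uniformly distributed over the type class $\mathcal{T}_{Q_{X_2}}$. From Lemma 3 (with the identifications $\mathcal{X}_1 \leftarrow \emptyset$, $V \leftarrow Q_{X_2}$, $V' \leftarrow P_{y_3^n}$, $W \leftarrow V$), we have that for every $V \in \mathscr{R}(Q_{X_2})$,

$$\mathsf{P}\big(\bar{X}_2^n \in \mathcal{T}_V(y_3^n)\big) \stackrel{.}{\le} \exp(-n I(P_{y_3^n}, V)). \tag{95}$$

Hence using the clause in (94) yields

$$\tau_n(y_3^n) \stackrel{.}{\le} \exp(-n \hat{I}(x_2^n \wedge y_3^n)). \tag{96}$$

Substituting this into the bound for $\mu_n(y_3^n)$ in (92) yields

$$\mu_n(y_3^n) \stackrel{.}{\le} \exp\big[-n\, |\hat{I}(x_2^n \wedge y_3^n) - R_2|^+\big]. \tag{97}$$

Plugging this back into the bound for $\beta_n$ in (91) yields

$$\beta_n \stackrel{.}{\le} \sum_{y_3^n} W_{Q_{X_1}}^n(y_3^n | x_2^n) \exp\big[-n\, |\hat{I}(x_2^n \wedge y_3^n) - R_2|^+\big] \tag{98}$$

$$= \sum_{V \in \mathscr{V}_n(\mathcal{Y}_3; Q_{X_2})} W_{Q_{X_1}}^n(\mathcal{T}_V(x_2^n) | x_2^n) \exp\big[-n\, |I(Q_{X_2}, V) - R_2|^+\big] \tag{99}$$

$$\le \sum_{V \in \mathscr{V}_n(\mathcal{Y}_3; Q_{X_2})} \exp\big[-n\big(D(V \| W_{Q_{X_1}} | Q_{X_2}) + |I(Q_{X_2}, V) - R_2|^+\big)\big]. \tag{100}$$

This gives the exponent $G_1(R, R_2)$ in (59) upon minimizing over all $V \in \mathscr{V}_n(\mathcal{Y}_3; Q_{X_2})$.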
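Two elementary facts used in this part of the proof can be verified directly on a small example: the type-class size bounds and sequence probability behind (88)-(89), and the clipping identity that turns the union bound (92) into the $|\cdot|^+$ form. The composition `(3, 5, 2)` and the rate values are arbitrary illustrative choices, and the function names are hypothetical.

```python
import math

def type_class_size(counts):
    """Number of sequences with the given symbol counts (multinomial coefficient)."""
    size = math.factorial(sum(counts))
    for c in counts:
        size //= math.factorial(c)
    return size

def type_entropy(counts):
    """Entropy (nats) of the type with the given symbol counts."""
    n = sum(counts)
    return -sum(c / n * math.log(c / n) for c in counts if c > 0)

def seq_log_prob(counts):
    """log Q^n(x) for a sequence x whose own type is Q."""
    n = sum(counts)
    return sum(c * math.log(c / n) for c in counts if c > 0)

counts = (3, 5, 2)
n = sum(counts)
size = type_class_size(counts)
upper = math.exp(n * type_entropy(counts))
lower = upper / (n + 1) ** len(counts)

def clipped_union_bound(n, r2, i_hat):
    """min{1, exp(n R2) exp(-n I)} evaluated directly."""
    return min(1.0, math.exp(n * r2) * math.exp(-n * i_hat))

def plus_form(n, r2, i_hat):
    """exp(-n |I - R2|^+), the form appearing in the exponent."""
    return math.exp(-n * max(i_hat - r2, 0.0))
```

The size of the type class lies between `lower` and `upper`, and the two forms of the union bound coincide whether the empirical mutual information is above or below $R_2$.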

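Second Packing Error $\epsilon_{D,2}$: We evaluate $\epsilon_{D,2}$ by partitioning the sample space into subsets where the conditional type of relay input $y_2^n$ given relay output $x_2^n$ is $Q_{Y_2|X_2} \in \mathscr{V}_n(\mathcal{Y}_2; Q_{X_2})$. That is,

$$\epsilon_{D,2} = \sum_{Q_{Y_2|X_2} \in \mathscr{V}_n(\mathcal{Y}_2; Q_{X_2})} \mathsf{P}\big(Y_2^n \in \mathcal{T}_{Q_{Y_2|X_2}}(X_2^n)\big)\, \varphi_n(Q_{Y_2|X_2}), \tag{101}$$

where $\varphi_n(Q_{Y_2|X_2})$ is defined as

$$\varphi_n(Q_{Y_2|X_2}) := \mathsf{P}\big(\hat{M}_j \ne 1 \,\big|\, L_j, L_{j-1} \text{ decoded correctly},\ Y_2^n \in \mathcal{T}_{Q_{Y_2|X_2}}(X_2^n)\big). \tag{102}$$

We bound the probability in (101) and $\varphi_n(Q_{Y_2|X_2})$ in the following. Then we optimize over all conditional types $Q_{Y_2|X_2} \in \mathscr{V}_n(\mathcal{Y}_2; Q_{X_2})$. This corresponds to the minimization in (60).

The probability in (101) can be first bounded using the same steps in (88) to (91) as

$$\mathsf{P}\big(Y_2^n \in \mathcal{T}_{Q_{Y_2|X_2}}(X_2^n)\big) \le (n+1)^{|\mathcal{X}_1|} \sum_{x_2^n \in \mathcal{T}_{Q_{X_2}}} \frac{1}{|\mathcal{T}_{Q_{X_2}}|} \sum_{y_2^n \in \mathcal{T}_{Q_{Y_2|X_2}}(x_2^n)} W_{Q_{X_1}}^n(y_2^n | x_2^n). \tag{103}$$

Now by using Lemma 1,

$$\mathsf{P}\big(Y_2^n \in \mathcal{T}_{Q_{Y_2|X_2}}(X_2^n)\big) \stackrel{.}{\le} \exp\big[-n D(Q_{Y_2|X_2} \| W_{Q_{X_1}} | Q_{X_2})\big]. \tag{104}$$

Recall the definitions of the sets $\mathscr{P}_n(Q_{X_1}, Q_{X_2}, Q_{\hat{Y}_2|X_2})$ and $\mathscr{K}_n(Q_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2 X_2})$. These sets will be used in the subsequent calculation. Implicit in the calculations below is the fact that $Y_2^n \in \mathcal{T}_{Q_{Y_2|X_2}}(X_2^n)$ and also that, for the fixed $Q_{Y_2|X_2}$, we have the fixed test channel $Q_{\hat{Y}_2|Y_2,X_2}(Q_{Y_2|X_2})$, which we will denote more succinctly as $Q_{\hat{Y}_2|Y_2 X_2}$. See the codebook generation. In the following steps, we simply use the notation $\mathcal{T}_{P_{X_1 X_2 \hat{Y}_2 Y_3}}$ as an abbreviation for the event that the random quadruple of sequences $(X_1^n(1), X_2^n(L_{j-1}), \hat{Y}_2^n(K_j | L_{j-1}), Y_3^n(j))$ belongs to the type class $\mathcal{T}_{P_{X_1 X_2 \hat{Y}_2 Y_3}}$. Following the strategy in [12], we now bound $\varphi_n(Q_{Y_2|X_2})$ in (102) by conditioning on various joint types $P_{X_1 X_2 \hat{Y}_2 Y_3} \in \mathscr{P}_n(Q_{X_1}, Q_{X_2}, Q_{\hat{Y}_2|X_2})$:

$$\varphi_n(Q_{Y_2|X_2}) \le \sum_{P_{X_1 X_2 \hat{Y}_2 Y_3} \in \mathscr{P}_n(Q_{X_1}, Q_{X_2}, Q_{\hat{Y}_2|X_2})} \mathsf{P}\big(\mathcal{T}_{P_{X_1 X_2 \hat{Y}_2 Y_3}}\big)\, \mathsf{P}\Big(\bigcup_{V \in \mathscr{K}_n(Q_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2 X_2})} \mathcal{E}_V \,\Big|\, \mathcal{T}_{P_{X_1 X_2 \hat{Y}_2 Y_3}}\Big), \tag{105}$$

where $Q_{\hat{Y}_2|X_2}$ is specified in (51) based on $Q_{\hat{Y}_2|Y_2 X_2}$ and $Q_{Y_2|X_2}$, and the event $\mathcal{E}_V$ is defined as

$$\mathcal{E}_V := \bigcup_{\tilde{Q}_{Y_2|X_2} \in \mathscr{V}_n(\mathcal{Y}_2; Q_{X_2}) \setminus \{Q_{Y_2|X_2}\}} \mathcal{E}_V(\tilde{Q}_{Y_2|X_2}), \tag{106}$$

with the constituent events defined as

$$\mathcal{E}_V(\tilde{Q}_{Y_2|X_2}) := \bigcup_{\tilde{m}_j \in [\exp(nR)] \setminus \{1\}}\ \bigcup_{\tilde{k}_j \in \mathcal{B}_{L_j}(\tilde{Q}_{Y_2|X_2}, L_{j-1})} \mathcal{E}_V(\tilde{Q}_{Y_2|X_2}, \tilde{m}_j, \tilde{k}_j), \tag{107}$$

and

$$\mathcal{E}_V(\tilde{Q}_{Y_2|X_2}, \tilde{m}_j, \tilde{k}_j) := \Big\{\big(X_1^n(\tilde{m}_j), X_2^n(L_{j-1}), \hat{Y}_2^n(\tilde{k}_j | L_{j-1}), Y_3^n(j)\big) \in \mathcal{T}_{Q_{X_1} Q_{X_2} (\tilde{Q}_{\hat{Y}_2|X_2} \times V)}\Big\}. \tag{108}$$

Recall the definition of $\tilde{Q}_{\hat{Y}_2|X_2}$ in (52). This is a function of the decoded conditional type $\tilde{Q}_{Y_2|X_2}$. Note that $\tilde{Q}_{Y_2|X_2}$ indexes an incorrect decoded conditional type (only polynomially many), $\tilde{m}_j$ indexes an incorrect decoded message and $\tilde{k}_j$ indexes a correctly ($\tilde{k}_j = K_j$) or incorrectly decoded bin index ($\tilde{k}_j \ne K_j$). The union over $\tilde{k}_j$ extends over the entire bin $\mathcal{B}_{L_j}(\tilde{Q}_{Y_2|X_2}, L_{j-1})$ and not only over incorrect bin indices. This is because an error is declared only if $\tilde{m}_j \ne 1$. Essentially, in the crucial step in (105), we have conditioned on the channel behavior (from $\mathcal{X}_1 \times \mathcal{X}_2$ to $\hat{\mathcal{Y}}_2 \times \mathcal{Y}_3$) and identified the set of conditional types (indexed by $V$) that leads to an error based on the $\alpha$-decoding step in (73).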

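Now we will bound the constituent elements in (105). Recall that $X_1^n$ is drawn uniformly at random from $\mathcal{T}_{Q_{X_1}}$, $X_2^n$ is drawn uniformly at random from $\mathcal{T}_{Q_{X_2}}$ and $\hat{Y}_2^n$ is drawn uniformly at random from $\mathcal{T}_{Q_{\hat{Y}_2|Y_2 X_2}}(y_2^n, x_2^n)$, where $y_2^n \in \mathcal{T}_{Q_{Y_2|X_2}}(x_2^n)$ and $Q_{Y_2|X_2}$ is the conditional type fixed in (101). Note that if we are given that $Y_2^n \in \mathcal{T}_{Q_{Y_2|X_2}}(x_2^n)$, it must be uniformly distributed in $\mathcal{T}_{Q_{Y_2|X_2}}(x_2^n)$ given $X_2^n = x_2^n$. Finally, $Y_3^n$ is drawn from the relay channel $W^n(\,\cdot\, | y_2^n, x_1^n, x_2^n)$. Using these facts, we can establish that the first probability in (105) can be expressed as

$$\mathsf{P}\big(\mathcal{T}_{P_{X_1 X_2 \hat{Y}_2 Y_3}}\big) = \sum_{x_1^n \in \mathcal{T}_{Q_{X_1}}} \frac{1}{|\mathcal{T}_{Q_{X_1}}|} \sum_{x_2^n \in \mathcal{T}_{Q_{X_2}}} \frac{1}{|\mathcal{T}_{Q_{X_2}}|}\, \vartheta(x_1^n, x_2^n), \tag{109}$$

where

$$\vartheta(x_1^n, x_2^n) := \sum_{y_2^n \in \mathcal{T}_{Q_{Y_2|X_2}}(x_2^n)} \frac{1}{|\mathcal{T}_{Q_{Y_2|X_2}}(x_2^n)|} \sum_{\substack{(\hat{y}_2^n, y_3^n) \in \mathcal{T}_{P_{\hat{Y}_2 Y_3 | X_1 X_2}}(x_1^n, x_2^n) : \\ \hat{y}_2^n \in \mathcal{T}_{Q_{\hat{Y}_2|Y_2 X_2}}(y_2^n, x_2^n)}} \frac{W^n(y_3^n | y_2^n, x_1^n, x_2^n)}{|\mathcal{T}_{Q_{\hat{Y}_2|Y_2 X_2}}(y_2^n, x_2^n)|}. \tag{110}$$

Notice that the term $|\mathcal{T}_{Q_{Y_2|X_2}}(x_2^n)|^{-1}\, \mathbb{1}\{y_2^n \in \mathcal{T}_{Q_{Y_2|X_2}}(x_2^n)\}$ in (110) indicates that $Y_2^n$ is uniformly distributed in $\mathcal{T}_{Q_{Y_2|X_2}}(x_2^n)$ given that $X_2^n = x_2^n$. We now bound $\vartheta(x_1^n, x_2^n)$ by the same logic as the steps from (88) to (91) (relating size of shells to probabilities of sequences). More precisely,

$$\vartheta(x_1^n, x_2^n) \stackrel{.}{\le} \sum_{y_2^n \in \mathcal{T}_{Q_{Y_2|X_2}}(x_2^n)} Q_{Y_2|X_2}^n(y_2^n | x_2^n) \sum_{\substack{(\hat{y}_2^n, y_3^n) \in \mathcal{T}_{P_{\hat{Y}_2 Y_3 | X_1 X_2}}(x_1^n, x_2^n) : \\ \hat{y}_2^n \in \mathcal{T}_{Q_{\hat{Y}_2|Y_2 X_2}}(y_2^n, x_2^n)}} Q_{\hat{Y}_2|Y_2 X_2}^n(\hat{y}_2^n | y_2^n, x_2^n)\, W^n(y_3^n | y_2^n, x_1^n, x_2^n) \tag{111}$$

$$\le \sum_{(\hat{y}_2^n, y_3^n) \in \mathcal{T}_{P_{\hat{Y}_2 Y_3 | X_1 X_2}}(x_1^n, x_2^n)}\ \sum_{y_2^n} Q_{Y_2|X_2}^n(y_2^n | x_2^n)\, Q_{\hat{Y}_2|Y_2 X_2}^n(\hat{y}_2^n | y_2^n, x_2^n)\, W^n(y_3^n | y_2^n, x_1^n, x_2^n) \tag{112}$$

$$= \sum_{(\hat{y}_2^n, y_3^n) \in \mathcal{T}_{P_{\hat{Y}_2 Y_3 | X_1 X_2}}(x_1^n, x_2^n)} W_{Q_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2 X_2}}^n(\hat{y}_2^n, y_3^n | x_1^n, x_2^n) \tag{113}$$

$$\le \exp\big[-n D\big(P_{\hat{Y}_2 Y_3 | X_1 X_2} \,\big\|\, W_{Q_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2 X_2}} \,\big|\, Q_{X_1} Q_{X_2}\big)\big], \tag{114}$$

where (112) follows by dropping the constraints $y_2^n \in \mathcal{T}_{Q_{Y_2|X_2}}(x_2^n)$ and $\hat{y}_2^n \in \mathcal{T}_{Q_{\hat{Y}_2|Y_2 X_2}}(y_2^n, x_2^n)$ and reorganizing the sums, (113) follows from the definition of $W_{Q_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2 X_2}}$ in (50) and (114) follows by Lemma 1. Substituting (114) into (109) yields the exponential bound

$$\mathsf{P}\big(\mathcal{T}_{P_{X_1 X_2 \hat{Y}_2 Y_3}}\big) \stackrel{.}{\le} \exp\big[-n D\big(P_{\hat{Y}_2 Y_3 | X_1 X_2} \,\big\|\, W_{Q_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2 X_2}} \,\big|\, Q_{X_1} Q_{X_2}\big)\big]. \tag{115}$$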

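Hence, all that remains is to bound the second probability (of the union) in (105). We first deal with the case where the decoded bin index $\tilde{k}_j$ is correct, i.e., equal to $K_j$. In this case, the $X_1^n$ codeword is conditionally independent of the outputs $(\hat{Y}_2^n, Y_3^n)$ given $X_2^n$. We have by the definition of $\mathcal{E}_V(\tilde{Q}_{Y_2|X_2}, \tilde{m}_j, \tilde{k}_j)$ in (108) that

$$\mathsf{P}\big(\mathcal{E}_V(\tilde{Q}_{Y_2|X_2}, \tilde{m}_j, K_j) \,\big|\, \mathcal{T}_{P_{X_1 X_2 \hat{Y}_2 Y_3}}\big) = \mathsf{P}\big((\bar{\hat{y}}_2^n, \bar{y}_3^n) \in \mathcal{T}_{\tilde{Q}_{\hat{Y}_2|X_2} \times V}(X_1^n(\tilde{m}_j), X_2^n(L_{j-1})) \,\big|\, \mathcal{T}_{P_{X_1 X_2 \hat{Y}_2 Y_3}}\big), \tag{116}$$

where we have used the bar notation $(\bar{\hat{y}}_2^n, \bar{y}_3^n)$ to denote an arbitrary pair of sequences in the "marginal shell" induced by $\bar{W}(\hat{y}_2, y_3 | x_2) := \sum_{x_1} Q_{X_1}(x_1)\, \tilde{Q}_{\hat{Y}_2|X_2}(\hat{y}_2 | x_2)\, V(y_3 | x_1, x_2, \hat{y}_2)$. We can condition on any realization of $X_2^n(L_{j-1}) = x_2^n \in \mathcal{T}_{Q_{X_2}}$ here. Lemma 2 (with identifications $P \leftarrow Q_{X_2}$, $V \leftarrow Q_{X_1}$, $V' \leftarrow \tilde{Q}_{\hat{Y}_2|X_2} \times V$, and $W \leftarrow \bar{W}$) yields

$$\mathsf{P}\big(\mathcal{E}_V(\tilde{Q}_{Y_2|X_2}, \tilde{m}_j, K_j) \,\big|\, \mathcal{T}_{P_{X_1 X_2 \hat{Y}_2 Y_3}}\big) \stackrel{.}{\le} \exp\big[-n I(Q_{X_1}, \tilde{Q}_{\hat{Y}_2|X_2} \times V | Q_{X_2})\big], \tag{117}$$

and so by applying the union bound (and using the fact that probability cannot exceed one),

$$\mathsf{P}\Big(\bigcup_{\tilde{m}_j \in [\exp(nR)] \setminus \{1\}} \mathcal{E}_V(\tilde{Q}_{Y_2|X_2}, \tilde{m}_j, K_j) \,\Big|\, \mathcal{T}_{P_{X_1 X_2 \hat{Y}_2 Y_3}}\Big) \stackrel{.}{\le} \exp\big[-n\, \big|I(Q_{X_1}, \tilde{Q}_{\hat{Y}_2|X_2} \times V | Q_{X_2}) - R\big|^+\big]. \tag{118}$$

This corresponds to the case involving $\psi_1$ in (62).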

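For the other case (i.e., $\psi_2$ in (63)) where both the message and bin index are incorrect ($\tilde{m}_j \ne 1$ and $\tilde{k}_j \ne K_j$), slightly more intricate analysis is required. For any conditional distribution $\tilde{Q}_{Y_2|X_2}$, define the excess rate

$$\Delta R_2(\tilde{Q}_{Y_2|X_2}) := \tilde{R}_2(\tilde{Q}_{Y_2|X_2}) - R_2, \tag{119}$$

where the inflated rate $\tilde{R}_2(\tilde{Q}_{Y_2|X_2})$ is defined in (69). Assume for the moment that $\Delta R_2(\tilde{Q}_{Y_2|X_2}) \ge 0$. Equivalently, this means that $R_2 \le I(\tilde{Q}_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2 X_2} | Q_{X_2}) + \nu_n$, which is, up to the $\nu_n \in \Theta(\frac{\log n}{n})$ term, the first clause in (63). Again using bars to denote random variables generated uniformly from their respective marginal type classes and arbitrary sequences in their respective marginal type classes, define as in [12]

$$\xi_n(V, \tilde{Q}_{Y_2|X_2}) := \exp(n \Delta R_2(\tilde{Q}_{Y_2|X_2})) \cdot \mathsf{P}\big[(\bar{x}_2^n, \bar{\hat{Y}}_2^n, \bar{y}_3^n) \in \mathcal{T}_{Q_{X_2} \tilde{Q}_{\hat{Y}_2|X_2} V_{Q_{X_1}}}\big] \tag{120}$$

$$\zeta_n(V, \tilde{Q}_{Y_2|X_2}) := \exp(nR) \cdot \mathsf{P}\big[(\bar{X}_1^n, \bar{x}_2^n, \bar{\hat{y}}_2^n, \bar{y}_3^n) \in \mathcal{T}_{Q_{X_2} Q_{X_1} (\tilde{Q}_{\hat{Y}_2|X_2} \times V)}\big] \tag{121}$$

where the conditional distributions appearing in the subscripts, including $V_{Q_{X_1}}$, are defined in (53) and (54) respectively. By applying the union bound one at a time to the two unions in (107) as was done in [12], we obtain

$$\mathsf{P}\big(\mathcal{E}_V(\tilde{Q}_{Y_2|X_2}) \,\big|\, \mathcal{T}_{P_{X_1 X_2 \hat{Y}_2 Y_3}}\big) \le \gamma_n(V, \tilde{Q}_{Y_2|X_2}), \tag{122}$$

where

$$\gamma_n(V, \tilde{Q}_{Y_2|X_2}) := \min\big\{1,\ \xi_n(V, \tilde{Q}_{Y_2|X_2}) \cdot \min\{1, \zeta_n(V, \tilde{Q}_{Y_2|X_2})\}\big\}. \tag{123}$$

Now by using the same reasoning that led to (118) (i.e., Lemma 2), we see that (120) and (121) evaluate to

$$\xi_n(V, \tilde{Q}_{Y_2|X_2}) \doteq \exp\big[-n\big(I(\tilde{Q}_{\hat{Y}_2|X_2}, V_{Q_{X_1}} | Q_{X_2}) - \Delta R_2(\tilde{Q}_{Y_2|X_2})\big)\big] \tag{124}$$

$$\zeta_n(V, \tilde{Q}_{Y_2|X_2}) \doteq \exp\big[-n\big(I(Q_{X_1}, \tilde{Q}_{\hat{Y}_2|X_2} \times V | Q_{X_2}) - R\big)\big]. \tag{125}$$

Hence, $\gamma_n(V, \tilde{Q}_{Y_2|X_2})$ in (123) has the following exponential behavior:

$$\gamma_n(V, \tilde{Q}_{Y_2|X_2}) \doteq \exp\Big[-n\, \Big|I(\tilde{Q}_{\hat{Y}_2|X_2}, V_{Q_{X_1}} | Q_{X_2}) - \Delta R_2(\tilde{Q}_{Y_2|X_2}) + \big|I(Q_{X_1}, \tilde{Q}_{\hat{Y}_2|X_2} \times V | Q_{X_2}) - R\big|^+\Big|^+\Big]. \tag{126}$$

Now consider $\Delta R_2(\tilde{Q}_{Y_2|X_2}) < 0$. Equivalently, this means that $R_2 > I(\tilde{Q}_{Y_2|X_2}, Q_{\hat{Y}_2|Y_2 X_2} | Q_{X_2}) + \nu_n$, which is, up to the $\nu_n \in \Theta(\frac{\log n}{n})$ term, the second clause in (63). In this case, we simply upper bound $\exp(n \Delta R_2(\tilde{Q}_{Y_2|X_2}))$ by unity and hence, $\xi_n(V, \tilde{Q}_{Y_2|X_2})$ satisfies

$$\xi_n(V, \tilde{Q}_{Y_2|X_2}) \stackrel{.}{\le} \exp\big[-n I(\tilde{Q}_{\hat{Y}_2|X_2}, V_{Q_{X_1}} | Q_{X_2})\big], \tag{127}$$

and this yields

$$\gamma_n(V, \tilde{Q}_{Y_2|X_2}) \stackrel{.}{\le} \exp\Big[-n\Big(I(\tilde{Q}_{\hat{Y}_2|X_2}, V_{Q_{X_1}} | Q_{X_2}) + \big|I(Q_{X_1}, \tilde{Q}_{\hat{Y}_2|X_2} \times V | Q_{X_2}) - R\big|^+\Big)\Big]. \tag{128}$$

Uniting the definition of $\tilde{R}_2(Q_{Y_2|X_2})$ in (69), the probabilities in (104) and (115), the case where only the message is incorrect in (118), the definition of the excess rate $\Delta R_2(\tilde{Q}_{Y_2|X_2})$ in (119) and the case where both message and bin index are incorrect in (126) and (128) yields the exponent $G_2(R, R_2)$ in (60) as desired. ■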
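The passage from the nested minimum in (123) to the exponents in (126) and (128) can be checked numerically. In this sketch the two mutual informations are plain numbers (`i_bin` for the term in (124), `i_msg` for the term in (125)); these argument names and the sample values are assumptions for illustration only.

```python
import math

def pos(t):
    """The |t|^+ operation."""
    return max(t, 0.0)

def gamma_exponent(i_bin, i_msg, r, delta_r2):
    """Exponent of gamma_n: (126) when delta_r2 >= 0, (128) when delta_r2 < 0.

    For negative excess rate, exp(n * delta_r2) is bounded by one, which is
    the same as replacing delta_r2 by |delta_r2|^+.
    """
    return pos(i_bin - pos(delta_r2) + pos(i_msg - r))

def gamma_direct(n, i_bin, i_msg, r, delta_r2):
    """-(1/n) log of min{1, xi * min{1, zeta}} using the exponential forms."""
    xi = math.exp(-n * (i_bin - pos(delta_r2)))
    zeta = math.exp(-n * (i_msg - r))
    return -math.log(min(1.0, xi * min(1.0, zeta))) / n

cases = [
    (0.3, 0.4, 0.5, 0.1),    # message term clipped to zero, bin term active
    (0.1, 0.3, 0.1, 0.5),    # large excess rate: the whole exponent clips to zero
    (0.1, 0.3, 0.1, -0.2),   # no binning regime, as in (128)
]
results = [(gamma_exponent(*c), gamma_direct(50, *c)) for c in cases]
```

The closed form and the direct evaluation agree in all three regimes, which shows how the inner and outer $|\cdot|^+$ operations arise from the two applications of "probability cannot exceed one".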



V. Conclusion and Future Work 

In this paper, we derived achievable error exponents for PDF and CF. We discuss a few avenues for further research. Since noisy network coding [23] is a variant of CF that generalizes various network coding scenarios, we hope to derive an achievable error exponent for noisy network coding in the future. We also expect that the moments of type class enumerator method by Merhav [24] can yield alternate forms of the random coding and expurgated bounds, which may have a different interpretation (perhaps from the statistical physics perspective) vis-a-vis the types-based random coding error exponent presented in Section III. In addition, a combination of DF and CF was used for relay networks with at least 4 nodes in Kramer-Gastpar-Gupta [8]. It may be insightful to derive the corresponding error exponents. Finally, upper (sphere-packing) bounds on the reliability function in (6) should be derived, but doing so requires a strong converse for DM-RCs, i.e., one needs to prove that above the rate prescribed by the cutset bound [3, Thm. 16.1], the error probability tends to one. It is not clear how to prove a strong converse for DM-RCs as the blowing-up lemma [6, Ch. 5] is not directly applicable to systems with feedback such as the DM-RC.
Appendix A
Proof of Lemma 2

Proof: Because $\bar{X}_2^n$ is generated uniformly at random from $\mathcal{T}_V(x_1^n)$,

Consider reverse channels $\tilde{V} : \mathcal{X}_1 \times \mathcal{Y} \to \mathcal{X}_2$ and let $\mathscr{R}(V)$ be the collection of reverse channels satisfying $\sum_y \tilde{V}(x_2 | x_1, y) W(y | x_1) = V(x_2 | x_1)$. Note that $y^n \in \mathcal{T}_{V'}(x_1^n, \bar{x}_2^n)$ holds if and only if there exists some $\tilde{V} \in \mathscr{V}_n(\mathcal{X}_2; P \times W) \cap \mathscr{R}(V)$ such that $\bar{x}_2^n \in \mathcal{T}_{\tilde{V}}(x_1^n, y^n)$. Then we can rewrite (129) as

where the last equality follows from the fact that $I(X_2; Y | X_1) = H(X_2 | X_1) - H(X_2 | X_1 Y)$. This proves the lemma. ■

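Appendix B
Proof of Lemma 3

Proof: Because $\bar{X}_2^n$ is uniformly distributed in $\mathcal{T}_V(x_1^n)$, we have

$$\mathsf{P}\big[\bar{X}_2^n \in \mathcal{T}_W(y^n, x_1^n)\big] = \sum_{\bar{x}_2^n \in \mathcal{T}_V(x_1^n)} \frac{1}{|\mathcal{T}_V(x_1^n)|}\, \mathbb{1}\{\bar{x}_2^n \in \mathcal{T}_W(y^n, x_1^n)\}. \tag{131}$$

As a result,

$$\mathsf{P}\big[\bar{X}_2^n \in \mathcal{T}_W(y^n, x_1^n)\big] = \frac{|\mathcal{T}_V(x_1^n) \cap \mathcal{T}_W(y^n, x_1^n)|}{|\mathcal{T}_V(x_1^n)|} \le \frac{|\mathcal{T}_W(y^n, x_1^n)|}{|\mathcal{T}_V(x_1^n)|} \le \frac{\exp(n H(W | P \times V'))}{(n+1)^{-|\mathcal{X}_1||\mathcal{X}_2|} \exp(n H(V | P))}. \tag{132}$$

Thus, we have

$$\mathsf{P}\big[\bar{X}_2^n \in \mathcal{T}_W(y^n, x_1^n)\big] \stackrel{.}{\le} \exp[-n I(V', W | P)] \tag{133}$$

in view of the fact that $W$ satisfies the marginal consistency property in the statement of the lemma and $I(X_2; Y | X_1) = H(X_2 | X_1) - H(X_2 | X_1 Y)$. ■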
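The bound in Lemma 3 can be confirmed by brute force in the special case used in (95), where $\mathcal{X}_1$ is empty and the alphabets are binary. The sketch below enumerates a whole type class and compares each shell probability with $(n+1)^{|\mathcal{X}_2|} \exp(-n \hat{I})$; the polynomial factor follows from the type-class size bounds, and the sequence `y` and the function names are arbitrary illustrative choices.

```python
import itertools
import math
from collections import Counter

def empirical_mi(x, y):
    """Empirical mutual information (nats) of two equal-length sequences."""
    n = len(x)
    cx, cy, cxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(c / n * math.log(c * n / (cx[a] * cy[b]))
               for (a, b), c in cxy.items())

def check_shell_bound(y, ones_in_x):
    """Check P(Xbar in T_V(y)) <= (n+1)^2 exp(-n I(P_y, V)) for every shell V.

    Xbar is uniform over binary sequences with `ones_in_x` ones; shells are
    identified by the joint type of (x, y).
    """
    n = len(y)
    shells, mi_of, total = Counter(), {}, 0
    for ones in itertools.combinations(range(n), ones_in_x):
        x = [1 if i in ones else 0 for i in range(n)]
        key = frozenset(Counter(zip(x, y)).items())
        shells[key] += 1
        mi_of[key] = empirical_mi(x, y)
        total += 1
    poly = (n + 1) ** 2   # (n+1)^{|X_2|} for a binary alphabet
    ok = all(cnt / total <= poly * math.exp(-n * mi_of[k])
             for k, cnt in shells.items())
    return ok, total, len(shells)

ok, total, num_shells = check_shell_bound([0, 0, 0, 1, 1, 1, 1, 0], 4)
```

Here the type class has 70 sequences split across 5 shells, and every shell probability respects the bound.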